Thread: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
[PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Brendan Jurd
Date:
On 17 March 2013 05:19, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Brendan Jurd <direvus@gmail.com> writes: >> On 16 March 2013 09:07, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> The thing is that that syntax creates an array of zero dimensions, >>> not one that has 1 dimension and zero elements. > >> I'm going to ask the question that immediately comes to mind: Is there >> anything good at all about being able to define a zero-dimensional >> array? > > Perhaps not. I think for most uses, a 1-D zero-length array would be > just as good. I guess what I'd want to know is whether we also need > to support higher-dimensional zero-size arrays, and if so, what does > the I/O syntax for those look like? > > Another fly in the ointment is that if we do redefine '{}' as meaning > something other than a zero-D array, how will we handle existing > database entries that are zero-D arrays? > Hello hackers, I submit a patch to rectify the weird and confusing quirk of Postgres to use "zero dimensions" to signify an empty array. Rationale: The concept of an array without dimensions is a) incoherent, and b) leads to astonishing results from our suite of user-facing array functions. array_length, array_dims, array_ndims, array_lower and array_upper all return NULL when presented such an array. When a user writes ARRAY[]::int[], they expect to get an empty array, but instead we've been giving them a bizarre semi-functional proto-array. Not cool. Approach: The patch teaches postgres to regard zero as an invalid number of dimensions (which it very much is), and allows instead for fully armed and operational empty arrays. The simplest form of empty array is the 1-D empty array (ARRAY[] or '{}') but the patch also allows for empty arrays over multiple dimensions, meaning that the final dimension has a length of zero, but the other dimensions may have any valid length. For example, '{{},{},{}}' is a 2-D empty array with dimension lengths {3,0} and dimension bounds [1:3][1:0]. An empty array dimension may have any valid lower bound, but by default the lower bound will be 1 and the upper bound will therefore be 0. Any zero-D arrays read in from existing database entries will be output as 1-D empty arrays from array_recv. There is an existing limitation where you cannot extend a multi-D array by assigning values to subscripts or slices. I've made no attempt to address that limitation. The patch improves some error reporting, particularly by adding errdetail messages for ereports of problems parsing a curly-brace array literal. There is some miscellaneous code cleanup included; I moved the hardcoded characters '{', '}', etc. into constants, unwound a superfluous inner loop from ArrayCount, and corrected some mistakes in code comments and error texts. If preferred for reviewing, I can split those changes (and/or the new errdetails) out into a separate patch. Incompatibility: This patch introduces an incompatible change in the behaviour of the aforementioned array functions -- instead of returning NULL for empty arrays they return meaningful values. Applications relying on the old behaviour to test for emptiness may be disrupted. One can future-proof against this by changing e.g. IF array_length(arr, 1) IS NULL ... to IF coalesce(array_length(arr, 1), 0) = 0 ... There is also a change in the way array subscript assignment works. Previously it wasn't possible to make a distinction between assigning to an empty array and assigning to a NULL, so in either case an expression like "arr[4] := 1" would create "[4:4]={1}". Now there is a distinction. If you assign to an empty array you will get "{NULL, NULL, NULL, 1}", whereas if you assign to a NULL you'll get "[4:4]={1}". Regression tests: The regression tests were updated to reflect behavioural changes. Documentation: A minor update to the prose in chapter 8.15 is included in the patch. Issues: Fixed-length arrays (like oidvector) are zero-based, which means that they are rendered into string form with their dimension info shown. So an empty oidvector now renders as "[0:-1]={}", which is correct but ugly. I'm not delighted with this side-effect but I'm not sure what can be done about it. I'm open to ideas. Diffstat: doc/src/sgml/array.sgml | 4 +- src/backend/executor/execQual.c | 77 +++++- src/backend/utils/adt/array_userfuncs.c | 23 +- src/backend/utils/adt/arrayfuncs.c | 671 +++++++++++++++++++++++++---------------------- src/backend/utils/adt/arrayutils.c | 4 + src/backend/utils/adt/xml.c | 19 +- src/include/c.h | 1 + src/include/utils/array.h | 4 + src/pl/plpython/plpy_typeio.c | 3 - src/test/isolation/expected/timeouts.out | 16 +- src/test/isolation/specs/timeouts.spec | 8 +- src/test/regress/expected/arrays.out | 55 ++-- src/test/regress/expected/create_function_3.out | 2 +- src/test/regress/expected/polymorphism.out | 2 +- src/test/regress/sql/arrays.sql | 6 +- 15 files changed, 519 insertions(+), 376 deletions(-) I'll add this to the Commit Fest app. It is intended to resolve TODO item "Improve handling of empty arrays". Cheers, BJ
Attachment
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
"David E. Wheeler"
Date:
On Mar 20, 2013, at 4:45 PM, Brendan Jurd <direvus@gmail.com> wrote: > I submit a patch to rectify the weird and confusing quirk of Postgres > to use "zero dimensions" to signify an empty array. Epic. Thank you. I’m very glad now that I complained about this (again)! Best, David
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Josh Berkus
Date:
On 03/20/2013 04:45 PM, Brendan Jurd wrote: > Incompatibility: > This patch introduces an incompatible change in the behaviour of the > aforementioned array functions -- instead of returning NULL for empty > arrays they return meaningful values. Applications relying on the old > behaviour to test for emptiness may be disrupted. One can As a heavy user of arrays, I support this patch as being worth the breakage of backwards compatibility. However, that means it certainly will need to wait for 9.4 to provide adequate warning. Brendan, how hard would it be to create a GUC for backwards-compatible behavior? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Brendan Jurd
Date:
On 25 March 2013 13:02, Josh Berkus <josh@agliodbs.com> wrote: > On 03/20/2013 04:45 PM, Brendan Jurd wrote: >> Incompatibility: >> This patch introduces an incompatible change in the behaviour of the >> aforementioned array functions -- instead of returning NULL for empty >> arrays they return meaningful values. Applications relying on the old >> behaviour to test for emptiness may be disrupted. One can > > As a heavy user of arrays, I support this patch as being worth the > breakage of backwards compatibility. However, that means it certainly > will need to wait for 9.4 to provide adequate warning. Thanks Josh, I appreciate the support. > > Brendan, how hard would it be to create a GUC for backwards-compatible > behavior? Good idea. I don't know how hard it would be (never messed with GUCs before), but I'll take a swing at it and see how it works out. Cheers, BJ
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Tom Lane
Date:
Brendan Jurd <direvus@gmail.com> writes: > On 25 March 2013 13:02, Josh Berkus <josh@agliodbs.com> wrote: >> Brendan, how hard would it be to create a GUC for backwards-compatible >> behavior? > Good idea. No, it *isn't* a good idea. GUCs that change application-visible semantics are dangerous. We should have learned this lesson by now. regards, tom lane
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Brendan Jurd
Date:
On 26 March 2013 00:30, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Brendan Jurd <direvus@gmail.com> writes: >> On 25 March 2013 13:02, Josh Berkus <josh@agliodbs.com> wrote: >>> Brendan, how hard would it be to create a GUC for backwards-compatible >>> behavior? > >> Good idea. > > No, it *isn't* a good idea. GUCs that change application-visible > semantics are dangerous. We should have learned this lesson by now. > They are? Well okay then. I'm not deeply attached to the GUC thing, it just seemed like a friendly way to ease the upgrade path. But if it's dangerous somehow I'm happy to drop it. Cheers, BJ
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Tom Lane
Date:
Brendan Jurd <direvus@gmail.com> writes: > On 26 March 2013 00:30, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> No, it *isn't* a good idea. GUCs that change application-visible >> semantics are dangerous. We should have learned this lesson by now. > They are? Yeah, they are, because things break when they're set wrong. Sometimes we have to have one anyway, but AFAICS this patch is just proposing to change corner cases that most apps don't hit. I think we'd be better off without a knob for it. regards, tom lane
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Brendan Jurd
Date:
On 21 March 2013 10:45, Brendan Jurd <direvus@gmail.com> wrote: > src/test/isolation/expected/timeouts.out | 16 +- > src/test/isolation/specs/timeouts.spec | 8 +- Oops, looks like some unrelated changes made their way into the original patch. Apologies. Here's a -v2 patch, sans stowaways. Cheers, BJ
Attachment
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Josh Berkus
Date:
Tom, > No, it *isn't* a good idea. GUCs that change application-visible > semantics are dangerous. We should have learned this lesson by now. Really? I thought that standard_conforming_strings was a great example of how to ease our users into a backwards-compatibility break. My thought was that we change the behavior in 9.4, provide a backwards-compatible GUC with warnings in the logs for two versions, and then take the GUC away. Whether that's worth the effort, though, is a question of how many people are going to be affected by the compatibility break. Maybe I can do a survey. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Greg Stark
Date:
On Mon, Mar 25, 2013 at 5:14 PM, Josh Berkus <josh@agliodbs.com> wrote: > Tom, > >> No, it *isn't* a good idea. GUCs that change application-visible >> semantics are dangerous. We should have learned this lesson by now. > > Really? I thought that standard_conforming_strings was a great example > of how to ease our users into a backwards-compatibility break. It is, but it was made in full knowledge of the risks and costs. What do you do if you're given two modules that have opposite expectations for this variable? One module or the other might have hidden unexpected bugs or even security holes if the variable is set in a way it doesn't expect. You have to pick one value for the whole database. I'm not as sanguine as Tom is about how likely these corner cases will be met actually. As far as I can tell checking IS NULL on array_length() was the supported way to check for 0-length arrays previously so I suspect a lot of applications will need to be updated. But it's not a hard update to do and is the kind of update that's often needed on major database upgrades. At least if it's a clean break then everyone on 9.3 knows they need to do IS NULL and everyone on 9.4 knows they need to check for 0. If there's a GUC then people need to code defensively without knowing which semantics they'll get. And if they test they need to test twice under both settings just in case the user's database has the other setting. -- greg
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Brendan Jurd
Date:
On 26 March 2013 05:26, Greg Stark <stark@mit.edu> wrote: > I'm not as sanguine as Tom is about how likely these corner cases will > be met actually. As far as I can tell checking IS NULL on > array_length() was the supported way to check for 0-length arrays > previously Correct. There was no other way to check for empty arrays, because every empty array was zero-D, and zero-D returns NULL from all array interrogation functions, even array_ndims (which is totally bonkers). > > At least if it's a clean break then everyone on 9.3 knows they need to > do IS NULL and everyone on 9.4 knows they need to check for 0. And, as I pointed out in at the top of the thread, you can write "coalesce(array_length(a,1), 0) = 0" if you want to be confident your code will continue work in either case, and for some folks I expect that will be the least painful way to upgrade from 9.3 -> 9.4. Cheers, BJ
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Craig Ringer
Date:
On 03/25/2013 10:28 PM, Tom Lane wrote: > Yeah, they are, because things break when they're set wrong. They also make debugging and support harder; you need to get an ever-growing list of GUC values from the user to figure out what their query does. bytea_output, standard_conforming_strings, etc. Yick. That said, I don't have a better answer for introducing non-BC changes. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Darren Duncan
Date:
On 2013.03.25 5:55 PM, Craig Ringer wrote: > On 03/25/2013 10:28 PM, Tom Lane wrote: >> Yeah, they are, because things break when they're set wrong. > They also make debugging and support harder; you need to get an > ever-growing list of GUC values from the user to figure out what their > query does. bytea_output, standard_conforming_strings, etc. Yick. > > That said, I don't have a better answer for introducing non-BC changes. Given the general trouble GUC values cause, is there a plan to deprecate and remove each of the existing ones over time? As long as post-removal there isn't any actual loss of functionality, but users might have to change their code to do it "the one true way", that would seem a good thing. -- Darren Duncan
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Darren Duncan
Date:
On 2013.03.25 6:03 PM, Darren Duncan wrote: > On 2013.03.25 5:55 PM, Craig Ringer wrote: >> On 03/25/2013 10:28 PM, Tom Lane wrote: >>> Yeah, they are, because things break when they're set wrong. >> They also make debugging and support harder; you need to get an >> ever-growing list of GUC values from the user to figure out what their >> query does. bytea_output, standard_conforming_strings, etc. Yick. >> >> That said, I don't have a better answer for introducing non-BC changes. > > Given the general trouble GUC values cause, is there a plan to deprecate and > remove each of the existing ones over time? As long as post-removal there isn't > any actual loss of functionality, but users might have to change their code to > do it "the one true way", that would seem a good thing. -- Darren Duncan To clarify, I mean GUC related to backwards compatibility matters, such as bytea_output or standard_conforming_strings, things that affect the logical behavior of code. I don't mean all GUC, not at all, most of the ones I know about should remain configurable. -- Darren Duncan
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Bruce Momjian
Date:
On Mon, Mar 25, 2013 at 10:14:15AM -0700, Josh Berkus wrote: > Tom, > > > No, it *isn't* a good idea. GUCs that change application-visible > > semantics are dangerous. We should have learned this lesson by now. > > Really? I thought that standard_conforming_strings was a great example > of how to ease our users into a backwards-compatibility break. My > thought was that we change the behavior in 9.4, provide a > backwards-compatible GUC with warnings in the logs for two versions, and > then take the GUC away. standard_conforming_strings is not a good example because it took 5+ years to implement the change, and issued warnings about non-standard use for several releases --- it is not a pattern to follow. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes: > On Mon, Mar 25, 2013 at 10:14:15AM -0700, Josh Berkus wrote: >>> No, it *isn't* a good idea. GUCs that change application-visible >>> semantics are dangerous. We should have learned this lesson by now. >> Really? I thought that standard_conforming_strings was a great example >> of how to ease our users into a backwards-compatibility break. My >> thought was that we change the behavior in 9.4, provide a >> backwards-compatible GUC with warnings in the logs for two versions, and >> then take the GUC away. > standard_conforming_strings is not a good example because it took 5+ > years to implement the change, and issued warnings about non-standard > use for several releases --- it is not a pattern to follow. s_c_s was an example of the worst possible case: where the behavioral change not merely breaks applications, but breaks them in a way that creates easily-exploitable security holes. We *had* to take that one really slow, and issue warnings for several years beforehand (and IIRC, there were still gripes from people who complained that we'd caused them security problems). I can't imagine that we'd go to that kind of trouble for any less-sensitive behavioral change. regards, tom lane
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Robert Haas
Date:
On Sun, Mar 24, 2013 at 10:02 PM, Josh Berkus <josh@agliodbs.com> wrote: > On 03/20/2013 04:45 PM, Brendan Jurd wrote: >> Incompatibility: >> This patch introduces an incompatible change in the behaviour of the >> aforementioned array functions -- instead of returning NULL for empty >> arrays they return meaningful values. Applications relying on the old >> behaviour to test for emptiness may be disrupted. One can > > As a heavy user of arrays, I support this patch as being worth the > breakage of backwards compatibility. However, that means it certainly > will need to wait for 9.4 to provide adequate warning. I expect to lose this argument, but I think this is a terrible idea. Users really hate it when they try to upgrade and find that they, uh, can't, because of some application-level incompatibility like this. They hate it twice as much when the change is essentially cosmetic. There's no functional problems with arrays as they exist today that this change would solve. The way to make a change like this without breaking things for users is to introduce a new type with different behavior and gradually deprecate the old one. Now, maybe it doesn't seem worth doing that for a change this small. But if so, I think that's evidence that this isn't worth changing in the first place, not that it's worth changing without regard for backwards-compatibility. Personally, I think if we're going to start whacking around the behavior here and risk inconveniencing our users, we ought to think a little larger. The fundamental thing that's dictating the current behavior is that we have arrays of between 1 and 6 dimensions all rolled up under a single data type. But in many cases, if not nearly all cases, what people want is, specifically, a one-dimensional array.If we were going to actually bite the bullet and createseparate data types for each possible number of array dimensions... and maybe fix some other problems at the same time... then the effort involved in ensuring backward-compatibility might seem like time better spent. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Pavel Stehule
Date:
2013/3/26 Robert Haas <robertmhaas@gmail.com>: > On Sun, Mar 24, 2013 at 10:02 PM, Josh Berkus <josh@agliodbs.com> wrote: >> On 03/20/2013 04:45 PM, Brendan Jurd wrote: >>> Incompatibility: >>> This patch introduces an incompatible change in the behaviour of the >>> aforementioned array functions -- instead of returning NULL for empty >>> arrays they return meaningful values. Applications relying on the old >>> behaviour to test for emptiness may be disrupted. One can >> >> As a heavy user of arrays, I support this patch as being worth the >> breakage of backwards compatibility. However, that means it certainly >> will need to wait for 9.4 to provide adequate warning. > > I expect to lose this argument, but I think this is a terrible idea. > Users really hate it when they try to upgrade and find that they, uh, > can't, because of some application-level incompatibility like this. > They hate it twice as much when the change is essentially cosmetic. > There's no functional problems with arrays as they exist today that > this change would solve. > > The way to make a change like this without breaking things for users > is to introduce a new type with different behavior and gradually > deprecate the old one. Now, maybe it doesn't seem worth doing that > for a change this small. But if so, I think that's evidence that this > isn't worth changing in the first place, not that it's worth changing > without regard for backwards-compatibility. > > Personally, I think if we're going to start whacking around the > behavior here and risk inconveniencing our users, we ought to think a > little larger. The fundamental thing that's dictating the current > behavior is that we have arrays of between 1 and 6 dimensions all > rolled up under a single data type. But in many cases, if not nearly > all cases, what people want is, specifically, a one-dimensional array. > If we were going to actually bite the bullet and create separate data > types for each possible number of array dimensions... and maybe fix > some other problems at the same time... then the effort involved in > ensuring backward-compatibility might seem like time better spent. > I understand, but I don't agree. W have to fix impractical design of arrays early. A ARRAY is 1st class - so there is not possible to use varchar2 trick. if we don't would to use GUC, what do you think about compatible extension? We can overload a system functions behave. This can solve a problem with updates and migrations. Regards Pavel > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Brendan Jurd
Date:
On 26 March 2013 22:57, Robert Haas <robertmhaas@gmail.com> wrote: > They hate it twice as much when the change is essentially cosmetic. > There's no functional problems with arrays as they exist today that > this change would solve. > We can't sensibly test for whether an array is empty. I'd call that a functional problem. The NULL return from array_{length,lower,upper,ndims} is those functions' way of saying their arguments failed a sanity check. So we cannot distinguish in a disciplined way between a valid, empty array, and bad arguments. If the zero-D implementation had been more polished and say, array_ndims returned zero, we had provided an array_empty function, or the existing functions threw errors for silly arguments instead of returning NULL, then I'd be more inclined to see your point. But as it stands, the zero-D implementation has always been half-baked and slightly broken, we just got used to working around it. Cheers, BJ
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Josh Berkus
Date:
> I expect to lose this argument, but I think this is a terrible idea. > Users really hate it when they try to upgrade and find that they, uh, > can't, because of some application-level incompatibility like this. > They hate it twice as much when the change is essentially cosmetic. > There's no functional problems with arrays as they exist today that > this change would solve. Sure there is. How do you distinguish between an array which is NULL and an array which is empty? Also, the whole array_dims is NULL thing trips up pretty much every single PG user who uses arrays for the first time. I'd expect when we announce the fix, we'll find that many users where doing the wrong thing in the first place and didn't know why it wasn't working. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Robert Haas
Date:
On Tue, Mar 26, 2013 at 9:02 AM, Brendan Jurd <direvus@gmail.com> wrote: > On 26 March 2013 22:57, Robert Haas <robertmhaas@gmail.com> wrote: >> They hate it twice as much when the change is essentially cosmetic. >> There's no functional problems with arrays as they exist today that >> this change would solve. > > We can't sensibly test for whether an array is empty. I'd call that a > functional problem. Sure you can. Equality comparisons work just fine. rhaas=# select '{}'::int4[] = '{}'::int4[];?column? ----------t (1 row) rhaas=# select '{}'::int4[] = '{1}'::int4[];?column? ----------f (1 row) > The NULL return from array_{length,lower,upper,ndims} is those > functions' way of saying their arguments failed a sanity check. So we > cannot distinguish in a disciplined way between a valid, empty array, > and bad arguments. If the zero-D implementation had been more > polished and say, array_ndims returned zero, we had provided an > array_empty function, or the existing functions threw errors for silly > arguments instead of returning NULL, then I'd be more inclined to see > your point. But as it stands, the zero-D implementation has always > been half-baked and slightly broken, we just got used to working > around it. Well, you could easily change array_ndims() to error out if ARR_NDIM() is negative or more than MAXDIM and return NULL only if it's exactly 0. That wouldn't break backward compatibility because it would throw an error only if fed a value that shouldn't ever exist in the first place, short of a corrupted database. I imagine the other functions are amenable to similar treatment. And if neither that nor just comparing against an empty array literal floats your boat, adding an array_is_empty() function would let you test for this condition without breaking backward compatibility, too. That's overkill, I think, but it would work. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Brendan Jurd
Date:
On 27 March 2013 06:47, Robert Haas <robertmhaas@gmail.com> wrote: > On Tue, Mar 26, 2013 at 9:02 AM, Brendan Jurd <direvus@gmail.com> wrote: >> We can't sensibly test for whether an array is empty. I'd call that a >> functional problem. > > Sure you can. Equality comparisons work just fine. > > rhaas=# select '{}'::int4[] = '{}'::int4[]; The good news is, if anybody out there is using that idiom to test for emptiness, they will not be disrupted by the change. Cheers, BJ
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes: > Well, you could easily change array_ndims() to error out if ARR_NDIM() > is negative or more than MAXDIM and return NULL only if it's exactly > 0. That wouldn't break backward compatibility because it would throw > an error only if fed a value that shouldn't ever exist in the first > place, short of a corrupted database. I imagine the other functions > are amenable to similar treatment. I haven't looked at the patch in detail, but I thought one of the key changes was that '{}' would now be interpreted as a zero-length 1-D array rather than a zero-D array. If we do that it seems a bit moot to argue about whether we should exactly preserve backwards-compatible behavior in array_ndims(), because the input it's looking at won't be the same anymore anyway. In any case, the entire point of this proposal is that the current behavior around zero-D arrays is *broken* and we don't want to be backwards-compatible with it anymore. So if you wish to argue against that opinion, do so; but it seems a bit beside the point to simply complain that backwards compatibility is being lost. regards, tom lane
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Dean Rasheed
Date:
On 26 March 2013 20:44, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> Well, you could easily change array_ndims() to error out if ARR_NDIM() >> is negative or more than MAXDIM and return NULL only if it's exactly >> 0. That wouldn't break backward compatibility because it would throw >> an error only if fed a value that shouldn't ever exist in the first >> place, short of a corrupted database. I imagine the other functions >> are amenable to similar treatment. > > I haven't looked at the patch in detail, but I thought one of the key > changes was that '{}' would now be interpreted as a zero-length 1-D > array rather than a zero-D array. If we do that it seems a bit moot > to argue about whether we should exactly preserve backwards-compatible > behavior in array_ndims(), because the input it's looking at won't be > the same anymore anyway. > The patch is also allowing '{{},{},{}}' which is described up-thread as a 2-D empty array. That's pretty misleading, since it has length 3 (in the first dimension). '{{},{}}' and '{{}}' are both "more empty", but neither is completely empty. It feels as though, if you were going down that road, then for completeness you'd need a new syntax for a truly empty 2-D array, e.g., '{}^2', but I can't really think of a use-case for that, or for any of the others for that matter. You'd be making it possible to have multiple different 2-D arrays, all empty in the sense of having no elements, but which would compare differently. Pretty much all you would be able to do with such arrays would be to append additional empty arrays. > In any case, the entire point of this proposal is that the current > behavior around zero-D arrays is *broken* and we don't want to be > backwards-compatible with it anymore. So if you wish to argue against > that opinion, do so; but it seems a bit beside the point to simply > complain that backwards compatibility is being lost. > I'm not saying that the current situation is not broken. I'm just questioning whether the fix is actually any less confusing than what we have now. +1 for adding an array_is_empty() function though --- I think there is enough evidence just in this thread that the current API is confusing. We don't currently provide a definitive way to test for empty arrays, so people inevitably invent their own (all subtly different) solutions. Regards, Dean
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Brendan Jurd
Date:
On 28 March 2013 00:21, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > The patch is also allowing '{{},{},{}}' which is described up-thread > as a 2-D empty array. That's pretty misleading, since it has length 3 > (in the first dimension). '{{},{}}' and '{{}}' are both "more empty", > but neither is completely empty. > I'm not saying that the current situation is not broken. I'm just > questioning whether the fix is actually any less confusing than what > we have now. Well the fix is primarily about 1-D empty arrays, and in that respect it is much less confusing than what we have now. As for multidim arrays, I don't use them in the field, so I don't have a strong opinion about how (or even whether) we should support empty multidim. I included the '{{}}' syntax following comments from Tom that we should consider supporting that if we were to get rid of zero-D, but I don't really have any skin in that game. Since we require multidim arrays to be regular, perhaps they would need extents in all dimensions to be practically useful ... if you wanted to define a blank tic-tac-toe board using a 2-D array, for example, you'd probably define it as 3x3 with NULL values in each position, rather than trying to make it "empty". Cheers, BJ
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Dean Rasheed
Date:
On 27 March 2013 17:14, Brendan Jurd <direvus@gmail.com> wrote: > On 28 March 2013 00:21, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: >> The patch is also allowing '{{},{},{}}' which is described up-thread >> as a 2-D empty array. That's pretty misleading, since it has length 3 >> (in the first dimension). '{{},{}}' and '{{}}' are both "more empty", >> but neither is completely empty. > >> I'm not saying that the current situation is not broken. I'm just >> questioning whether the fix is actually any less confusing than what >> we have now. > > Well the fix is primarily about 1-D empty arrays, and in that respect > it is much less confusing than what we have now. Maybe. But even in 1-D, it's still jumping from having one empty array to infinitely many starting at different indexes, e.g., '{}'::int[] != '[4:3]={}'::int[]. There may be a certain logic to that, but I'm not convinced about its usefulness. Also, it is incompatible with the choice made for empty ranges, which are all normalised to a single unique empty range value --- the empty set, with no start- or end-points. I find that quite logical, and so to me, it makes most sense to also have a single unique empty array, which is then necessarily dimensionless. Regards, Dean
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Brendan Jurd
Date:
On 28 March 2013 09:39, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > On 27 March 2013 17:14, Brendan Jurd <direvus@gmail.com> wrote: >> Well the fix is primarily about 1-D empty arrays, and in that respect >> it is much less confusing than what we have now. > > Maybe. But even in 1-D, it's still jumping from having one empty array > to infinitely many starting at different indexes, e.g., '{}'::int[] != > '[4:3]={}'::int[]. There may be a certain logic to that, but I'm not > convinced about its usefulness. We already have the ability to define lower bounds other than 1 on arrays, and it would be inconsistent to allow that for arrays with elements, but not for arrays without. I could imagine somebody wanting to create an empty zero-based array, and then iteratively append elements to it. > Also, it is incompatible with the choice made for empty ranges, To me it doesn't make sense to try to draw parallels between arrays and ranges, they really are completely different things. Cheers, BJ
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Tom Lane
Date:
Brendan Jurd <direvus@gmail.com> writes: > On 28 March 2013 09:39, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: >> Maybe. But even in 1-D, it's still jumping from having one empty array >> to infinitely many starting at different indexes, e.g., '{}'::int[] != >> '[4:3]={}'::int[]. There may be a certain logic to that, but I'm not >> convinced about its usefulness. > We already have the ability to define lower bounds other than 1 on > arrays, and it would be inconsistent to allow that for arrays with > elements, but not for arrays without. Yeah, if '[1:1]={0}'::int[] is distinct from '[2:2]={0}'::int[], it's a bit hard to argue that '[1:0]={}'::int[] must not be distinct from '[2:1]={}'::int[]. If we were doing this from scratch we might drop the whole notion of nondefault lower bounds, but that ship sailed ages ago. regards, tom lane
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Josh Berkus
Date:
> We already have the ability to define lower bounds other than 1 on > arrays, and it would be inconsistent to allow that for arrays with > elements, but not for arrays without. I could imagine somebody > wanting to create an empty zero-based array, and then iteratively > append elements to it. Hmmm. You know, I think that's why we originally had array_upper() on an empty array return NULL, not 0. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Dean Rasheed
Date:
On 28 March 2013 00:04, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Brendan Jurd <direvus@gmail.com> writes: >> On 28 March 2013 09:39, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: >>> Maybe. But even in 1-D, it's still jumping from having one empty array >>> to infinitely many starting at different indexes, e.g., '{}'::int[] != >>> '[4:3]={}'::int[]. There may be a certain logic to that, but I'm not >>> convinced about its usefulness. > >> We already have the ability to define lower bounds other than 1 on >> arrays, and it would be inconsistent to allow that for arrays with >> elements, but not for arrays without. > > Yeah, if '[1:1]={0}'::int[] is distinct from '[2:2]={0}'::int[], > it's a bit hard to argue that '[1:0]={}'::int[] must not be > distinct from '[2:1]={}'::int[]. If we were doing this from scratch > we might drop the whole notion of nondefault lower bounds, but that > ship sailed ages ago. > You could make the exact same argument for ranges --- if '[1,1]'::int4range is distinct from '[2,2]'::int4range, why isn't '[1,1)'::int4range distinct from '[2,2)'::int4range? I see ranges and arrays as very closely related because the extents of an array are an integer range, and the extents of an empty array are an empty range. Moreover, they have almost identical API functions. There are two internally self-consistent models for handling empty ranges/arrays --- one in which empty ranges/arrays are considered not to have lower/upper bounds, and one in which they are. In the first model, there is only one empty range/array. In the second, there are infinitely many, all different. Both models can be written in a consistent way, but what seems inconsistent is to choose one model for ranges, and change to a different model for arrays. Regards, Dean
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Tom Lane
Date:
Dean Rasheed <dean.a.rasheed@gmail.com> writes: > On 28 March 2013 00:04, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Yeah, if '[1:1]={0}'::int[] is distinct from '[2:2]={0}'::int[], >> it's a bit hard to argue that '[1:0]={}'::int[] must not be >> distinct from '[2:1]={}'::int[]. > You could make the exact same argument for ranges --- if > '[1,1]'::int4range is distinct from '[2,2]'::int4range, why isn't > '[1,1)'::int4range distinct from '[2,2)'::int4range? Because an array's bounds are a core piece of the identity of the array. In another universe PG's arrays might not have been defined like that, but as things stand, they are. We chose to define ranges differently. That's okay, because ranges *are not arrays*. If they were the same, we'd not have needed to invent a separate concept. > I see ranges and arrays as very closely related because the extents of > an array are an integer range, and the extents of an empty array are > an empty range. Moreover, they have almost identical API functions. I think this argument has about as much validity as claiming that because the subscripts of an array are integers, arrays should behave like integers. regards, tom lane
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Dean Rasheed
Date:
On 28 March 2013 03:01, Tom Lane <tgl@sss.pgh.pa.us> wrote: > [snip] > ranges *are not arrays*. OK, fair enough. I guess it's the mathematician in me seeing patterns in things that behave similarly, but which are admittedly different. Is the patch also going to allow empty arrays in higher dimensions where not just the last dimension is empty? It seems as though, if it's allowing 1-by-0 arrays like '{{}}' and '[4:4][8:7]={{}}', it should also allow 0-by-0 arrays like '[4:3][8:7]={}', and 0-by-3 arrays like '[4:3][11:13]={}'. That last example seems like the more useful kind of thing to allow, since you might one day be able to append a non-empty 1-D array onto it. As it stands, the patch only allows empty 2-D arrays that are empty in the final dimension, to which the only thing you could append would be more empty 1-D arrays. Regards, Dean
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Brendan Jurd
Date:
On 28 March 2013 20:34, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > Is the patch also going to allow empty arrays in higher dimensions > where not just the last dimension is empty? It doesn't allow that at present. > It seems as though, if > it's allowing 1-by-0 arrays like '{{}}' and '[4:4][8:7]={{}}', it > should also allow 0-by-0 arrays like '[4:3][8:7]={}', and 0-by-3 > arrays like '[4:3][11:13]={}'. I think the array literal parser would reject this on the grounds that the brace structure doesn't match the dimension specifiers. You could modify that check to respect zero length in dimensions other than the innermost one, but it's hard to say whether it would be worth the effort. It might be instructive to hear from somebody who does (or intends to) use multidim arrays for some practical purpose, but I don't even know whether such a person exists within earshot of this list. Cheers, BJ
Pavel Stehule <pavel.stehule@gmail.com> writes: > I understand, but I don't agree. W have to fix impractical design of > arrays early. A ARRAY is 1st class - so there is not possible to use > varchar2 trick. > > if we don't would to use GUC, what do you think about compatible > extension? We can overload a system functions behave. This can solve a > problem with updates and migrations. In Common Lisp arrays are multi-dimensional too, and the 1-D arrays have a specific name, they call them vectors. If you create an array with only 1 dimension, then it's a vector. All the array API works the same on vectors. I guess we could specialize some of our API on vectors and avoid having to tell the system which dimension we are interested into when using them rather than multi-dimensional arrays. Now from the internals perspective I also guess we don't want to be generating so many more types so maybe we would need some tricks to know how to promote a 1-D array into a vector automatically? -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
I am not sure if CL is best example - it has no NULL SQL logic.
Current issue is introduced by design, where empty array is similar to NULL (but not same) - what is on half way to Oracle varchar behave. But it is not consistent with other data types in postgres.
Regards
Pavel
2013/3/31 Dimitri Fontaine <dimitri@2ndquadrant.fr>
Pavel Stehule <pavel.stehule@gmail.com> writes:
> I understand, but I don't agree. W have to fix impractical design of
> arrays early. A ARRAY is 1st class - so there is not possible to use
> varchar2 trick.
>
> if we don't would to use GUC, what do you think about compatible
> extension? We can overload a system functions behave. This can solve a
> problem with updates and migrations.
In Common Lisp arrays are multi-dimensional too, and the 1-D arrays have
a specific name, they call them vectors. If you create an array with
only 1 dimension, then it's a vector. All the array API works the same
on vectors.
I guess we could specialize some of our API on vectors and avoid having
to tell the system which dimension we are interested into when using
them rather than multi-dimensional arrays.
Now from the internals perspective I also guess we don't want to be
generating so many more types so maybe we would need some tricks to know
how to promote a 1-D array into a vector automatically?
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Robert Haas
Date:
On Tue, Mar 26, 2013 at 4:39 PM, Brendan Jurd <direvus@gmail.com> wrote: > On 27 March 2013 06:47, Robert Haas <robertmhaas@gmail.com> wrote: >> On Tue, Mar 26, 2013 at 9:02 AM, Brendan Jurd <direvus@gmail.com> wrote: >>> We can't sensibly test for whether an array is empty. I'd call that a >>> functional problem. >> >> Sure you can. Equality comparisons work just fine. >> >> rhaas=# select '{}'::int4[] = '{}'::int4[]; > > The good news is, if anybody out there is using that idiom to test for > emptiness, they will not be disrupted by the change. According to the discussion downthread, apparently they will, because you're introducing an infinitude of empty arrays, not all of which compare equal to '{}'::int4. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Robert Haas
Date:
On Tue, Mar 26, 2013 at 4:44 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> Well, you could easily change array_ndims() to error out if ARR_NDIM() >> is negative or more than MAXDIM and return NULL only if it's exactly >> 0. That wouldn't break backward compatibility because it would throw >> an error only if fed a value that shouldn't ever exist in the first >> place, short of a corrupted database. I imagine the other functions >> are amenable to similar treatment. > > I haven't looked at the patch in detail, but I thought one of the key > changes was that '{}' would now be interpreted as a zero-length 1-D > array rather than a zero-D array. If we do that it seems a bit moot > to argue about whether we should exactly preserve backwards-compatible > behavior in array_ndims(), because the input it's looking at won't be > the same anymore anyway. > > In any case, the entire point of this proposal is that the current > behavior around zero-D arrays is *broken* and we don't want to be > backwards-compatible with it anymore. So if you wish to argue against > that opinion, do so; but it seems a bit beside the point to simply > complain that backwards compatibility is being lost. I don't think the current behavior is broken. I found it counterintuitive at first, but then I got used to it. It's reasonably self-consistent: arrays can't have empty dimensions, therefore the empty array is unique and dimensionless. Is that the behavior I would have picked if I had designed the type? No, it isn't. I wouldn't have tried to support one-dimensional arrays and multi-dimensional arrays in the same data type either, nor would I have supported non-default lower bounds. But all of those ships have sailed, and the time to change them is not after people have spent 10 years building applications that work with the current behavior. If we want to introduce a new type with different, perhaps better, behavior, well, I think that might be a fine idea. But I *don't* think imposing a hard compatibility break on users of arrays is a good idea. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Merlin Moncure
Date:
On Mon, Apr 1, 2013 at 6:43 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Tue, Mar 26, 2013 at 4:44 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Robert Haas <robertmhaas@gmail.com> writes: >>> Well, you could easily change array_ndims() to error out if ARR_NDIM() >>> is negative or more than MAXDIM and return NULL only if it's exactly >>> 0. That wouldn't break backward compatibility because it would throw >>> an error only if fed a value that shouldn't ever exist in the first >>> place, short of a corrupted database. I imagine the other functions >>> are amenable to similar treatment. >> >> I haven't looked at the patch in detail, but I thought one of the key >> changes was that '{}' would now be interpreted as a zero-length 1-D >> array rather than a zero-D array. If we do that it seems a bit moot >> to argue about whether we should exactly preserve backwards-compatible >> behavior in array_ndims(), because the input it's looking at won't be >> the same anymore anyway. >> >> In any case, the entire point of this proposal is that the current >> behavior around zero-D arrays is *broken* and we don't want to be >> backwards-compatible with it anymore. So if you wish to argue against >> that opinion, do so; but it seems a bit beside the point to simply >> complain that backwards compatibility is being lost. > > I don't think the current behavior is broken. I found it > counterintuitive at first, but then I got used to it. It's reasonably > self-consistent: arrays can't have empty dimensions, therefore the > empty array is unique and dimensionless. Is that the behavior I would > have picked if I had designed the type? No, it isn't. I wouldn't > have tried to support one-dimensional arrays and multi-dimensional > arrays in the same data type either, nor would I have supported > non-default lower bounds. But all of those ships have sailed, and the > time to change them is not after people have spent 10 years building > applications that work with the current behavior. If we want to > introduce a new type with different, perhaps better, behavior, well, I > think that might be a fine idea. But I *don't* think imposing a hard > compatibility break on users of arrays is a good idea. 100% agree. Also huge +1 on your backwards compatibility comments upthread -- couldn't agree more. My $company just wrapped up a one year porting effort to 9.2 from *8.1* due to compatibility issues. If you want custom array behaviors, creating a type is probably the best way unless it can be 100% proven that this will not break code. merlin
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Jim Nasby
Date:
On 4/1/13 8:58 AM, Merlin Moncure wrote: > On Mon, Apr 1, 2013 at 6:43 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> On Tue, Mar 26, 2013 at 4:44 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Robert Haas <robertmhaas@gmail.com> writes: >>>> Well, you could easily change array_ndims() to error out if ARR_NDIM() >>>> is negative or more than MAXDIM and return NULL only if it's exactly >>>> 0. That wouldn't break backward compatibility because it would throw >>>> an error only if fed a value that shouldn't ever exist in the first >>>> place, short of a corrupted database. I imagine the other functions >>>> are amenable to similar treatment. >>> >>> I haven't looked at the patch in detail, but I thought one of the key >>> changes was that '{}' would now be interpreted as a zero-length 1-D >>> array rather than a zero-D array. If we do that it seems a bit moot >>> to argue about whether we should exactly preserve backwards-compatible >>> behavior in array_ndims(), because the input it's looking at won't be >>> the same anymore anyway. >>> >>> In any case, the entire point of this proposal is that the current >>> behavior around zero-D arrays is *broken* and we don't want to be >>> backwards-compatible with it anymore. So if you wish to argue against >>> that opinion, do so; but it seems a bit beside the point to simply >>> complain that backwards compatibility is being lost. >> >> I don't think the current behavior is broken. I found it >> counterintuitive at first, but then I got used to it. It's reasonably >> self-consistent: arrays can't have empty dimensions, therefore the >> empty array is unique and dimensionless. Is that the behavior I would >> have picked if I had designed the type? No, it isn't. I wouldn't >> have tried to support one-dimensional arrays and multi-dimensional >> arrays in the same data type either, nor would I have supported >> non-default lower bounds. But all of those ships have sailed, and the >> time to change them is not after people have spent 10 years building >> applications that work with the current behavior. If we want to >> introduce a new type with different, perhaps better, behavior, well, I >> think that might be a fine idea. But I *don't* think imposing a hard >> compatibility break on users of arrays is a good idea. > > 100% agree. Also huge +1 on your backwards compatibility comments > upthread -- couldn't agree more. My $company just wrapped up a one > year porting effort to 9.2 from *8.1* due to compatibility issues. > > If you want custom array behaviors, creating a type is probably the > best way unless it can be 100% proven that this will not break code. +1, and FWIW I'd love to see a new, cleaner array implementation. My beefs with the current system: - Adjustable lower bound is *way* more trouble than it's worth (does anyone actually have a use-case for supporting it?) - Need to be able to tell if an array is actually empty/null, which is not the same as an array where all elements are null - Enforced dimensions would be nice, though I can't say it's that big a deal - Not sure that special casing 1 dimension arrays would be worth it... perhaps enforced dimensions solve whatever problemsexist there
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Brendan Jurd
Date:
On 1 April 2013 21:57, Robert Haas <robertmhaas@gmail.com> wrote: > On Tue, Mar 26, 2013 at 4:39 PM, Brendan Jurd <direvus@gmail.com> wrote: >> On 27 March 2013 06:47, Robert Haas <robertmhaas@gmail.com> wrote: >>> rhaas=# select '{}'::int4[] = '{}'::int4[]; >> >> The good news is, if anybody out there is using that idiom to test for >> emptiness, they will not be disrupted by the change. > > According to the discussion downthread, apparently they will, because > you're introducing an infinitude of empty arrays, not all of which > compare equal to '{}'::int4. It is not possible to construct e.g. '[3:2]={}' or '{{}, {}}' in existing applications, so there is no way for that idiom in existing applications to be broken by upgrading. If testing for equality with '{}' works now, it will also work post-upgrade. The only way for it to stop working is if somebody upgrades, and *then* goes out of their way to create an empty array with nondefault lower bounds, and then tries to compare that array against the empty array with default lower bounds, to test for emptiness. Which would be silly. Big picture: A very large number of users wouldn't be using arrays at all, and of those who are, probably a vanishingly small number (perhaps zero) care about how emptiness interacts with multiple dimensions or nondefault lower bounds. We're talking about a corner case inside a corner case here. For most folks, this upgrade would break nothing. A few (myself included) will want to grep their code for array_(lower|upper|length|dims) call sites and maybe make some tweaks. Cheers, BJ
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Robert Haas
Date:
On Mon, Apr 1, 2013 at 6:40 PM, Brendan Jurd <direvus@gmail.com> wrote: > It is not possible to construct e.g. '[3:2]={}' or '{{}, {}}' in > existing applications, so there is no way for that idiom in existing > applications to be broken by upgrading. If testing for equality with > '{}' works now, it will also work post-upgrade. I suppose, strictly speaking, that that is true, but I think there are all kinds of ways for this to go wrong. Your previous complaint was that there was no easy way to check whether an array is empty. I pointed out that there is: compare it against an empty array and see whether they are equal. Now you're saying, let's change it so we can have lots of different empty arrays, each one different from all the others. Well, that breaks the world for anybody who is using the method that works today. Sure, they've probably got to have at least some kind of application change before the wheels really start to come off, but as soon as some array that's empty but not equal to {} creeps into their application by whatever means, they've got trouble. And what's the upside? Do we really want or need an infinitude of all-distinct empty arrays? Besides, even if you're not convinced that that particular thing will break applications, there are clearly going to be cases where some array functions return different results after this change than they did beforehand. That's actually the whole point of this change. Right? And, if you change the value that a built-in function returns when given the same input, that can break user applications. > The only way for it to stop working is if somebody upgrades, and > *then* goes out of their way to create an empty array with nondefault > lower bounds, and then tries to compare that array against the empty > array with default lower bounds, to test for emptiness. Which would > be silly. > > Big picture: A very large number of users wouldn't be using arrays at > all, and of those who are, probably a vanishingly small number > (perhaps zero) care about how emptiness interacts with multiple > dimensions or nondefault lower bounds. We're talking about a corner > case inside a corner case here. I think the first half of this argument is off-point, because it's true in general, for any feature, that people who aren't using a feature won't notice if the behavior of that feature changes. But we have good evidence that even obscure, largely broken features like RULEs have user bases that are significant enough that changing the behavior of those systems will generate protest. I'd be surprised if the number of people using arrays isn't far larger than the number of people using rules for table partitioning. As to the second half, part of the point of this patch is to change things so that array_length('{}'::int4[], 1) returns non-NULL. But I myself have written code that tests for an empty array by writing array_length(blahblah, 1) IS NULL. So with your proposed change, that code would break. Sure, I would have been happier, when I wrote that code, if the return value had been 0 rather than NULL. But the return value *was* NULL, so I wrote the code that way, and it worked. I can't be the only one to have done something like that. Anyone who has would get an ugly surprise if we did what you're proposing. In a large application, it might not even be clear (until you started poking at it) what exactly had changed; and if they've got tens or hundreds of thousands of lines of code (hardly unreasonable), fixing it might be quite a chore. Imposing that pain our user base does not seem justified to me. > For most folks, this upgrade would break nothing. A few (myself > included) will want to grep their code for > array_(lower|upper|length|dims) call sites and maybe make some tweaks. I think the only people for whom nothing will break are the people who aren't using arrays in the first place. Anyone who is is likely to have dependencies on the way array_lower/upper work today. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
"David E. Wheeler"
Date:
On Apr 1, 2013, at 4:59 PM, Robert Haas <robertmhaas@gmail.com> wrote: > I think the only people for whom nothing will break are the people who > aren't using arrays in the first place. Anyone who is is likely to > have dependencies on the way array_lower/upper work today. Well, what if we add new functions that return 0 for empty arrays, but leave the existing ones alone? Perhaps call them array_size(),array_first_index(), and array_last_index(). Then nothing has to break, and we can decide independently if wewant to deprecate the older functions in a future release. Or not. Best, David
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Brendan Jurd
Date:
On 2 April 2013 10:59, Robert Haas <robertmhaas@gmail.com> wrote: > On Mon, Apr 1, 2013 at 6:40 PM, Brendan Jurd <direvus@gmail.com> wrote: >> It is not possible to construct e.g. '[3:2]={}' or '{{}, {}}' in >> existing applications, so there is no way for that idiom in existing >> applications to be broken by upgrading. If testing for equality with >> '{}' works now, it will also work post-upgrade. > > Sure, they've probably got to have at least > some kind of application change before the wheels really start to come > off, but as soon as some array that's empty but not equal to {} creeps > into their application by whatever means, they've got trouble. Constructing an array with nondefault bounds isn't happening by accident, you'd have to know about, and deliberately make use of, the obscure '[m:n]={}' syntax for array literals. How is it going to "creep in"? I note the total absence of people rending their garments over the fact that '{foo}' does not equal '[2:2]={foo}'. Cheers, BJ
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Brendan Jurd
Date:
On 2 April 2013 11:34, David E. Wheeler <david@kineticode.com> wrote: > On Apr 1, 2013, at 4:59 PM, Robert Haas <robertmhaas@gmail.com> wrote: > >> I think the only people for whom nothing will break are the people who >> aren't using arrays in the first place. Anyone who is is likely to >> have dependencies on the way array_lower/upper work today. > > Well, what if we add new functions that return 0 for empty arrays, but leave the existing ones alone? Perhaps call themarray_size(), array_first_index(), and array_last_index(). Then nothing has to break, and we can decide independentlyif we want to deprecate the older functions in a future release. Or not. I think having an 'array_size' and an 'array_length' that behave differently would be legitimately confusing, and I can't think of any alternative function name that would concisely explain the difference in behaviour -- 'array_length_without_the_stupid_nulls' is just too long. Cheers, BJ
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Pavel Stehule
Date:
Hello
2013/4/2 Brendan Jurd <direvus@gmail.com>
On 2 April 2013 11:34, David E. Wheeler <david@kineticode.com> wrote:I think having an 'array_size' and an 'array_length' that behave
> On Apr 1, 2013, at 4:59 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
>> I think the only people for whom nothing will break are the people who
>> aren't using arrays in the first place. Anyone who is is likely to
>> have dependencies on the way array_lower/upper work today.
>
> Well, what if we add new functions that return 0 for empty arrays, but leave the existing ones alone? Perhaps call them array_size(), array_first_index(), and array_last_index(). Then nothing has to break, and we can decide independently if we want to deprecate the older functions in a future release. Or not.
differently would be legitimately confusing, and I can't think of any
alternative function name that would concisely explain the difference
in behaviour -- 'array_length_without_the_stupid_nulls' is just too
long.
yes,
we should to do complete redesign or do nothing.
Regards
Pavel
Cheers,
BJ
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Gavin Flower
Date:
<div class="moz-cite-prefix">On 29/03/13 13:12, Brendan Jurd wrote:<br /></div><blockquote cite="mid:CADxJZo3fXdFCNS_sMyAiBgPCzbGLnYwCdnQaXLqARPntD5MFJQ@mail.gmail.com"type="cite"><pre wrap="">On 28 March 2013 20:34,Dean Rasheed <a class="moz-txt-link-rfc2396E" href="mailto:dean.a.rasheed@gmail.com"><dean.a.rasheed@gmail.com></a>wrote: </pre><blockquote type="cite"><pre wrap="">Is the patch also going to allow empty arrays in higher dimensions where not just the last dimension is empty? </pre></blockquote><pre wrap=""> It doesn't allow that at present. </pre><blockquote type="cite"><pre wrap="">It seems as though, if it's allowing 1-by-0 arrays like '{{}}' and '[4:4][8:7]={{}}', it should also allow 0-by-0 arrays like '[4:3][8:7]={}', and 0-by-3 arrays like '[4:3][11:13]={}'. </pre></blockquote><pre wrap=""> I think the array literal parser would reject this on the grounds that the brace structure doesn't match the dimension specifiers. You could modify that check to respect zero length in dimensions other than the innermost one, but it's hard to say whether it would be worth the effort. It might be instructive to hear from somebody who does (or intends to) use multidim arrays for some practical purpose, but I don't even know whether such a person exists within earshot of this list. Cheers, BJ </pre></blockquote><font size="-1">Multiple dimension arrays. 'might' be something I may need, or at least usefully use inmy current p<font size="-1">roject</font></font>.<br /><br /> I am not ready to do the database design to that detail yet.<br/><br /> However, I know I will need to do 'clever'(TM) things with multiple images areas on images, and possiblygroup several images together (and still access their individual image areas). So potentially arrays of dimension2 or greater may be useful. Certainly, some images will not have any image areas defined, some arrays might beempty.<br /><br /><br /> Cheers,<br /> Gavin<br /><br /><br />
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Florian Pflug
Date:
On Apr1, 2013, at 13:43 , Robert Haas <robertmhaas@gmail.com> wrote: > I don't think the current behavior is broken. I found it > counterintuitive at first, but then I got used to it. It's reasonably > self-consistent: arrays can't have empty dimensions, therefore the > empty array is unique and dimensionless. Is that the behavior I would > have picked if I had designed the type? No, it isn't. I wouldn't > have tried to support one-dimensional arrays and multi-dimensional > arrays in the same data type either, nor would I have supported > non-default lower bounds. But all of those ships have sailed, and the > time to change them is not after people have spent 10 years building > applications that work with the current behavior. If we want to > introduce a new type with different, perhaps better, behavior, well, I > think that might be a fine idea. But I *don't* think imposing a hard > compatibility break on users of arrays is a good idea. +1. Especially since having infinitely many variants of "empty" is IMHO hardly an improvement over the existing situation. If we're going to break compatibility, we should IMHO get rid of non-zero lower bounds all together. My guess is that the number of affected users wouldn't be much higher than for the proposed patch, and it'd allow lossless mapping to most language's native array types… best regards, Florian Pflug
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Andrew Dunstan
Date:
On 04/02/2013 02:46 PM, Florian Pflug wrote: > On Apr1, 2013, at 13:43 , Robert Haas <robertmhaas@gmail.com> wrote: >> I don't think the current behavior is broken. I found it >> counterintuitive at first, but then I got used to it. It's reasonably >> self-consistent: arrays can't have empty dimensions, therefore the >> empty array is unique and dimensionless. Is that the behavior I would >> have picked if I had designed the type? No, it isn't. I wouldn't >> have tried to support one-dimensional arrays and multi-dimensional >> arrays in the same data type either, nor would I have supported >> non-default lower bounds. But all of those ships have sailed, and the >> time to change them is not after people have spent 10 years building >> applications that work with the current behavior. If we want to >> introduce a new type with different, perhaps better, behavior, well, I >> think that might be a fine idea. But I *don't* think imposing a hard >> compatibility break on users of arrays is a good idea. > +1. > > Especially since having infinitely many variants of "empty" is IMHO > hardly an improvement over the existing situation. > > If we're going to break compatibility, we should IMHO get rid of > non-zero lower bounds all together. My guess is that the number of > affected users wouldn't be much higher than for the proposed patch, > and it'd allow lossless mapping to most language's native array types… That would actually break a HUGE number of users, since the default lower bound is 1. I have seen any number of pieces if code that rely on that. Doesn't the standard have something to say about all this? cheers andrew
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Florian Pflug
Date:
On Apr3, 2013, at 15:30 , Andrew Dunstan <andrew@dunslane.net> wrote: > On 04/02/2013 02:46 PM, Florian Pflug wrote: >> If we're going to break compatibility, we should IMHO get rid of >> non-zero lower bounds all together. My guess is that the number of >> affected users wouldn't be much higher than for the proposed patch, >> and it'd allow lossless mapping to most language's native array types… > > That would actually break a HUGE number of users, since the default lower > bound is 1. I have seen any number of pieces if code that rely on that. Uh, yeah, we should make it 1 then, not 0, then. As long as the bound is fixed, conversion to native C/Java/Ruby/Python/... arrays would still be lossless. best regards, Florian Pflug
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes: > On 04/02/2013 02:46 PM, Florian Pflug wrote: >> If we're going to break compatibility, we should IMHO get rid of >> non-zero lower bounds all together. My guess is that the number of >> affected users wouldn't be much higher than for the proposed patch, >> and it'd allow lossless mapping to most language's native array types� > That would actually break a HUGE number of users, since the default > lower bound is 1. I have seen any number of pieces if code that rely on > that. I assume he meant non-one lower bounds. But in any case, removing the option for other lower bounds would be very clearly a removal of useful functionality, so I don't see it happening ... certainly not if we're so tied to bug-compatibility that we can't redefine the number of dimensions an empty array has. I think though that the upthread argument that we'd have multiple interpretations of the same thing is bogus. To me, the core idea that's being suggested here is that '{}' should mean a zero-length 1-D array, not a zero-D array as formerly. We would still need a way to represent zero-D arrays, if only because they'd still exist on-disk in existing databases (assuming we're not willing to break pg_upgrade for this). I suggest that that ought *not* involve any braces. Perhaps '[]=' would be a suitable representation. In the other direction, ISTM that '{{},{},{}}' is a zero-by-three array, entirely distinct from '{}' or '{{}}' in dimensionality if not content. I haven't worked this out in complete detail, but if different strings mean the same thing then we don't have the representation defined right. regards, tom lane
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Gavin Flower
Date:
On 04/04/13 03:02, Florian Pflug wrote: > On Apr3, 2013, at 15:30 , Andrew Dunstan <andrew@dunslane.net> wrote: >> On 04/02/2013 02:46 PM, Florian Pflug wrote: >>> If we're going to break compatibility, we should IMHO get rid of >>> non-zero lower bounds all together. My guess is that the number of >>> affected users wouldn't be much higher than for the proposed patch, >>> and it'd allow lossless mapping to most language's native array types… >> That would actually break a HUGE number of users, since the default lower >> bound is 1. I have seen any number of pieces if code that rely on that. > Uh, yeah, we should make it 1 then, not 0, then. As long as the bound > is fixed, conversion to native C/Java/Ruby/Python/... arrays would still > be lossless. > > best regards, > Florian Pflug > > Zero as the default lower bound is consistent with most languages (especially the common ones like C, C++, Java, & Python), in fact I don't remember any language where that is not the case (ignoring SQL) - and I've written programs in about 20 languages. Maybe we should adopt the famous compromise of '0.5'? :-) Cheers, Gavin
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Pavel Stehule
Date:
2013/4/3 Gavin Flower <GavinFlower@archidevsys.co.nz>
On 04/04/13 03:02, Florian Pflug wrote:On Apr3, 2013, at 15:30 , Andrew Dunstan <andrew@dunslane.net> wrote:Zero as the default lower bound is consistent with most languages (especially the common ones like C, C++, Java, & Python), in fact I don't remember any language where that is not the case (ignoring SQL) - and I've written programs in about 20 languages.On 04/02/2013 02:46 PM, Florian Pflug wrote:Uh, yeah, we should make it 1 then, not 0, then. As long as the boundIf we're going to break compatibility, we should IMHO get rid ofThat would actually break a HUGE number of users, since the default lower
non-zero lower bounds all together. My guess is that the number of
affected users wouldn't be much higher than for the proposed patch,
and it'd allow lossless mapping to most language's native array types…
bound is 1. I have seen any number of pieces if code that rely on that.
is fixed, conversion to native C/Java/Ruby/Python/... arrays would still
be lossless.
best regards,
Florian Pflug
pascal, ADA, and ALGOL like languages
Regards
Pavel
Maybe we should adopt the famous compromise of '0.5'? :-)
Cheers,
Gavin
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Gavin Flower
Date:
On 04/04/13 04:58, Pavel Stehule wrote:
ALOGOL 60 was zero based by default, as I remember deliberately setting the lower bound to 1, I managed to avoid PASCAL and I only glanced at ADA.2013/4/3 Gavin Flower <GavinFlower@archidevsys.co.nz>On 04/04/13 03:02, Florian Pflug wrote:On Apr3, 2013, at 15:30 , Andrew Dunstan <andrew@dunslane.net> wrote:Zero as the default lower bound is consistent with most languages (especially the common ones like C, C++, Java, & Python), in fact I don't remember any language where that is not the case (ignoring SQL) - and I've written programs in about 20 languages.On 04/02/2013 02:46 PM, Florian Pflug wrote:Uh, yeah, we should make it 1 then, not 0, then. As long as the boundIf we're going to break compatibility, we should IMHO get rid ofThat would actually break a HUGE number of users, since the default lower
non-zero lower bounds all together. My guess is that the number of
affected users wouldn't be much higher than for the proposed patch,
and it'd allow lossless mapping to most language's native array types…
bound is 1. I have seen any number of pieces if code that rely on that.
is fixed, conversion to native C/Java/Ruby/Python/... arrays would still
be lossless.
best regards,
Florian Pflugpascal, ADA, and ALGOL like languagesRegardsPavel
Cheers,
Gavin
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Pavel Stehule
Date:
2013/4/3 Gavin Flower <GavinFlower@archidevsys.co.nz>
ALOGOL 60 was zero based by default, as I remember deliberately setting the lower bound to 1, I managed to avoid PASCAL and I only glanced at ADA.On 04/04/13 04:58, Pavel Stehule wrote:2013/4/3 Gavin Flower <GavinFlower@archidevsys.co.nz>On 04/04/13 03:02, Florian Pflug wrote:On Apr3, 2013, at 15:30 , Andrew Dunstan <andrew@dunslane.net> wrote:Zero as the default lower bound is consistent with most languages (especially the common ones like C, C++, Java, & Python), in fact I don't remember any language where that is not the case (ignoring SQL) - and I've written programs in about 20 languages.On 04/02/2013 02:46 PM, Florian Pflug wrote:Uh, yeah, we should make it 1 then, not 0, then. As long as the boundIf we're going to break compatibility, we should IMHO get rid ofThat would actually break a HUGE number of users, since the default lower
non-zero lower bounds all together. My guess is that the number of
affected users wouldn't be much higher than for the proposed patch,
and it'd allow lossless mapping to most language's native array types…
bound is 1. I have seen any number of pieces if code that rely on that.
is fixed, conversion to native C/Java/Ruby/Python/... arrays would still
be lossless.
best regards,
Florian Pflugpascal, ADA, and ALGOL like languagesRegardsPavel
Regards
Pavel
Pavel
Cheers,
Gavin
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Pavel Stehule
Date:
ALOGOL 60 was zero based by default, as I remember deliberately setting the lower bound to 1, I managed to avoid PASCAL and I only glanced at ADA.Pavel
In Pascal and similar languages (Wirth family) is possible to define lower bound - any value (like PL/pgSQL). Depends on a school - but lot of people used 1.
I remember - it was one argument for in Pascal/C flame war.
Regards
Pavel
Regards
Pavel
Cheers,
Gavin
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Tom Lane
Date:
>> Zero as the default lower bound is consistent with most languages >> (especially the common ones like C, C++, Java, & Python), in fact >> I don't remember any language where that is not the case (ignoring >> SQL) - and I've written programs in about 20 languages. Fortran ... Basic ... actually I'd have thought that zero was a minority position. Fashions change I guess. regards, tom lane
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
"David E. Wheeler"
Date:
On Apr 3, 2013, at 9:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Fortran ... Basic ... actually I'd have thought that zero was a > minority position. Fashions change I guess. I say we turn the default lower bound up to 11. David
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Andrew Dunstan
Date:
On 04/03/2013 11:34 AM, Gavin Flower wrote: > > Zero as the default lower bound is consistent with most languages > (especially the common ones like C, C++, Java, & Python), in fact I > don't remember any language where that is not the case (ignoring SQL) > - and I've written programs in about 20 languages. Pascal and Ada are obvious examples. I'm quite sure there are others, but don't have time to do the research. cheers andrew
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Robert Haas
Date:
On Wed, Apr 3, 2013 at 12:36 PM, David E. Wheeler <david@kineticode.com> wrote: > On Apr 3, 2013, at 9:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Fortran ... Basic ... actually I'd have thought that zero was a >> minority position. Fashions change I guess. > > I say we turn the default lower bound up to 11. +1. I think that's clearly a compromise position that will make everyone equally happy. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Dean Rasheed
Date:
On 3 April 2013 15:10, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I think though that the upthread argument that we'd have multiple > interpretations of the same thing is bogus. To me, the core idea that's > being suggested here is that '{}' should mean a zero-length 1-D array, > not a zero-D array as formerly. We would still need a way to represent > zero-D arrays, if only because they'd still exist on-disk in existing > databases (assuming we're not willing to break pg_upgrade for this). > I suggest that that ought *not* involve any braces. Perhaps '[]=' would > be a suitable representation. Then restoring an old SQL dump would silently convert all old zero-D arrays to zero-length 1-D arrays. Also, if you failed to completely rewrite all your application code, and accidentally missed a few functions, you could very easily end up in a situation where your tables contained a mixture of zero-D arrays and zero-length 1-D arrays, that wouldn't compare equally, leading to all sorts of nasty bugs. > In the other direction, ISTM that > '{{},{},{}}' is a zero-by-three array, Isn't that a three-by-zero array rather than a zero-by-three array? Either way, what is the use-case for it? I certainly can't remember ever using a 0x3 or 3x0 matrix. > entirely distinct from '{}' or > '{{}}' in dimensionality if not content. And '{{}}' is a one-by-zero array. There is currently no proposed syntax for representing a zero-by-zero array, or a zero-by-one array. All I see here is endless opportunity for confusion and code breakage, with very little real benefit. I actually don't think the current behaviour is broken. I find it perfectly logical and consistent to have a single dimensionless concept of an empty array. It has no elements, so questions like are they arranged in 1-D or 2-D, and what index do they start at are meaningless, and so should return NULL. The only counter-intuitive thing is that array_length() returns NULL instead of 0 for the empty array, and it doesn't take long to get used to that behaviour. Much of this seems to have grown out of a desire to fix the confusing behaviour of array_length(), but seems to have ended up in a much more confusing place. Given that array_length() takes 2 arguments, it is somewhat forgivable for it to return NULL when you ask for the array's length in a dimension that it doesn't have. Perhaps what might make the API easier to work with is a new single-argument function --- e.g., array_size() that would return the number of elements in the array --- 0 for '{}', 3 for '{1,2,3}', 6 for '{{1,2},{3,4},{5,6}}' and so on. Regards, Dean
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Kevin Grittner
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote: > Fortran ... Basic ... actually I'd have thought that zero was a > minority position. Fashions change I guess. When I started programming the top four languages for business programming were COBOL, BASIC, RPG II, and assembly language. (Pascal and C came later, and I never saw much use of Fortran by anyone other than mathematicians.) Except for assembly language, the subscripts for arrays either started with 1 always, or that was the default. Given when it was first developed, it's not too surprising that the SQL standard adopted 1 as the first element of an array. Which is more natural depends on whether you think of the subscript in terms of ordinal positions or offsets from the base address. -- Kevin Grittner EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Gavin Flower
Date:
On 04/04/13 05:21, Pavel Stehule wrote:
[...]ALOGOL 60 was zero based by default, as I remember deliberately setting the lower bound to 1, I managed to avoid PASCAL and I only glanced at ADA.PavelIn Pascal and similar languages (Wirth family) is possible to define lower bound - any value (like PL/pgSQL). Depends on a school - but lot of people used 1.I remember - it was one argument for in Pascal/C flame war.RegardsPavel
At the time (just over 40 years ago!) I was adamant that arrays should start with an index of one, now I much prefer zero. Probably I prefer zero in part, because now I understand what is happening at the machine code level, and partly because zero is the default for the main languages I use.
Wasting time on Google (I have 'more' important things I 'should' be doing!), I find ALGOL 60 did not appear to have a default value for the lower index - not only that, but one could make it negative! see: http://www.csci.csusb.edu/dick/samples/algol60.syntax.html
Anyhow, I think we should standardise on zero as the initial index to be as consistent as practicable. However, not with a religious zeal at the expense of practical considerations!
Cheers,
Gavin
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Andres Freund
Date:
On 2013-04-04 07:42:36 +1300, Gavin Flower wrote: > On 04/04/13 05:21, Pavel Stehule wrote: > > > >> > >> Pavel > >> > > ALOGOL 60 was zero based by default, as I remember > > deliberately setting the lower bound to 1, I managed to avoid > > PASCAL and I only glanced at ADA. > > > > > > http://en.wikipedia.org/wiki/Comparison_of_programming_languages_%28array%29 > > > > > >In Pascal and similar languages (Wirth family) is possible to define lower > >bound - any value (like PL/pgSQL). Depends on a school - but lot of people > >used 1. > > > >I remember - it was one argument for in Pascal/C flame war. > > > >Regards > > > >Pavel > > > [...] > > At the time (just over 40 years ago!) I was adamant that arrays should start > with an index of one, now I much prefer zero. Probably I prefer zero in > part, because now I understand what is happening at the machine code level, > and partly because zero is the default for the main languages I use. > > Wasting time on Google (I have 'more' important things I 'should' be > doing!), I find ALGOL 60 did not appear to have a default value for the > lower index - not only that, but one could make it negative! see: > http://www.csci.csusb.edu/dick/samples/algol60.syntax.html > > Anyhow, I think we should standardise on zero as the initial index to be as > consistent as practicable. However, not with a religious zeal at the > expense of practical considerations! Changing this now, rather than on a green field, strikes me as a pretty absurd exercise in frustrating users by breaking queries subtly. Since 1 is a value index that won't always break visible. -1 Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Gavin Flower
Date:
<div class="moz-cite-prefix">On 04/04/13 05:30, Tom Lane wrote:<br /></div><blockquote cite="mid:16677.1365006659@sss.pgh.pa.us"type="cite"><blockquote type="cite"><blockquote type="cite"><pre wrap="">Zero asthe default lower bound is consistent with most languages (especially the common ones like C, C++, Java, & Python), in fact I don't remember any language where that is not the case (ignoring SQL) - and I've written programs in about 20 languages. </pre></blockquote></blockquote><pre wrap=""> Fortran ... Basic ... actually I'd have thought that zero was a minority position. Fashions change I guess. regards, tom lane </pre></blockquote><font size="-1">I had forgotten the indexing in BASIC & FO<font size="-1">RTRAN</font>, <font size="-1">Inow recall COBOL's TABL<font size="-1">E's were equivalent to arrays and they started at on<font size="-1">e.<br/><br /><br /><font size="-1">Cheers,<br /><font size="-1">Gavin</font><br /></font></font></font></font></font>
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Kevin Grittner
Date:
Gavin Flower <GavinFlower@archidevsys.co.nz> wrote: > Anyhow, I think we should standardise on zero as the initial > index to be as consistent as practicable. If you want to suggest a default of zero for the first subscript of an array in SQL, please don't confuse the issue by using any form of the word "standard" in that proposal. There are ANSI and ISO standards for SQL, and they require that the first element of an array is one. I'm OK with *extending* the standard by *allowing* other values, but let's not flaunt the standard and break existing code by changing the *default*. -- Kevin Grittner EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Gavin Flower
Date:
<div class="moz-cite-prefix">On 04/04/13 05:36, David E. Wheeler wrote:<br /></div><blockquote cite="mid:EE62F5DD-16EB-46C7-AD72-9743619E7650@kineticode.com"type="cite"><pre wrap="">On Apr 3, 2013, at 9:30 AM, Tom Lane<a class="moz-txt-link-rfc2396E" href="mailto:tgl@sss.pgh.pa.us"><tgl@sss.pgh.pa.us></a> wrote: </pre><blockquote type="cite"><pre wrap="">Fortran ... Basic ... actually I'd have thought that zero was a minority position. Fashions change I guess. </pre></blockquote><pre wrap=""> I say we turn the default lower bound up to 11. David </pre></blockquote><font size="-1">In keeping with the level of irrationality in this thread, ma<font size="-1">ybe we shouldset it to an irrational number like the square root of 2, or transcend our selves and make in </font></font>a transcendentalnumber like pi! :-)<br /><br /> I suppose using the square root of minus one would be consider too imaginative??? :-)<br /><br /><br /> Cheers,<br /> Gavin<br />
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Gavin Flower
Date:
On 04/04/13 07:58, Kevin Grittner wrote: > Gavin Flower <GavinFlower@archidevsys.co.nz> wrote: > >> Anyhow, I think we should standardise on zero as the initial >> index to be as consistent as practicable. > If you want to suggest a default of zero for the first subscript of > an array in SQL, please don't confuse the issue by using any form > of the word "standard" in that proposal. There are ANSI and ISO > standards for SQL, and they require that the first element of an > array is one. I'm OK with *extending* the standard by *allowing* > other values, but let's not flaunt the standard and break existing > code by changing the *default*. > > -- > Kevin Grittner > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company You omitted my rider 'However, not with a religious zeal at the expense of practical considerations!' Which was meant to cover concerns like yours. Cheers, Gavin
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Andres Freund
Date:
On 2013-04-04 08:03:03 +1300, Gavin Flower wrote: > On 04/04/13 07:58, Kevin Grittner wrote: > >Gavin Flower <GavinFlower@archidevsys.co.nz> wrote: > > > >>Anyhow, I think we should standardise on zero as the initial > >>index to be as consistent as practicable. > >If you want to suggest a default of zero for the first subscript of > >an array in SQL, please don't confuse the issue by using any form > >of the word "standard" in that proposal. There are ANSI and ISO > >standards for SQL, and they require that the first element of an > >array is one. I'm OK with *extending* the standard by *allowing* > >other values, but let's not flaunt the standard and break existing > >code by changing the *default*. That's already possible: postgres=# SELECT ('[0:3]={e1,e2,e3,e4}'::text[])[0];text ------e1 (1 row) Not too convenient, but ... > You omitted my rider 'However, not with a religious zeal at the expense of > practical considerations!' Which was meant to cover concerns like yours. Given its already possible I don't understand what you propose then. A guc that allows changing the default? That would need to be included in dumps and such and would make old dumps - which won't include an explicit SET of the current value - ambigious to load. So that doesn't seem to be a good idea either. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Jim Nasby
Date:
On 4/3/13 10:34 AM, Gavin Flower wrote: > Maybe we should adopt the famous compromise of '0.5'? +0.5. ;P -- Jim C. Nasby, Data Architect jim@nasby.net 512.569.9461 (cell) http://jim.nasby.net
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Gavin Flower
Date:
<div class="moz-cite-prefix">On 04/04/13 11:55, Jim Nasby wrote:<br /></div><blockquote cite="mid:515CB368.4020003@nasby.net"type="cite">On 4/3/13 10:34 AM, Gavin Flower wrote: <br /><blockquote type="cite">Maybewe should adopt the famous compromise of '0.5'? <br /></blockquote><br /> +0.5. ;P <br /></blockquote><fontsize="-1">Far too positive for our grim world! How about '-0,5' instead? :-)</font><br /><br /> I noticeyou call yourself a 'Data Architect' - never too sure If I should call myself <br /> a 'software architect who pretendshe can program' or <br /> a 'programmer with pretensions of being a software architect'? :-)<br />
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Brendan Jurd
Date:
On 4 April 2013 01:10, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I think though that the upthread argument that we'd have multiple > interpretations of the same thing is bogus. To me, the core idea that's > being suggested here is that '{}' should mean a zero-length 1-D array, > not a zero-D array as formerly. We would still need a way to represent > zero-D arrays, if only because they'd still exist on-disk in existing > databases (assuming we're not willing to break pg_upgrade for this). Tom, My thought was that on-disk zero-D arrays should be converted into empty 1-D arrays (with default lower bounds of course) when they are read by array_recv. Any SQL operation on your zero-D arrays would therefore resolve as though they were 1-D. A pg_dump/restore would result in the arrays being 1-D on the restore side. If pg_upgrade conserves the zero-D array in binary form, that's okay since the receiving end will just treat it as 1-D out of array_recv anyway. My intention was that the zero-D array could continue to live indefinitely in binary form, but would never be observable as such by any application code. Cheers, BJ
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Tom Lane
Date:
Brendan Jurd <direvus@gmail.com> writes: > My thought was that on-disk zero-D arrays should be converted into > empty 1-D arrays (with default lower bounds of course) when they are > read by array_recv. Huh? array_recv would not get applied to datums coming off of disk. The only way to make this 100% transparent would be to go through every C-coded function that deals with arrays and make sure it delivers identical results for both cases. It's possible we could do that for array functions in the core code, but what about user-written extensions? In any case, the whole exercise is pointless if we don't change the visible behavior of array_dims et al. So I think the idea that this would be without visible consequence is silly. What's up for argument is just how much incompatibility is acceptable. regards, tom lane
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Brendan Jurd
Date:
On 4 April 2013 15:11, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Brendan Jurd <direvus@gmail.com> writes: >> My thought was that on-disk zero-D arrays should be converted into >> empty 1-D arrays (with default lower bounds of course) when they are >> read by array_recv. > > Huh? array_recv would not get applied to datums coming off of disk. My mistake, sorry for the noise. > In any case, the whole exercise is pointless if we don't change the > visible behavior of array_dims et al. So I think the idea that this > would be without visible consequence is silly. What's up for argument > is just how much incompatibility is acceptable. I don't know that anyone was suggesting there would be no visible consequences of any kind. I was hoping that we could at least represent on-disk zero-D arrays as though they were 1-D. If that's not going to fly, and we are stuck with continuing to allow zero-D as a valid representation, then perhaps your '[]=' syntax would be the way to proceed. It would not be terribly difficult to rework the patch along those lines, although I have to admit "allow empty arrays with dimensions" is not nearly so satisfying a title as "exorcise zero-dimensional arrays". Cheers, BJ
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Boszormenyi Zoltan
Date:
<div class="moz-cite-prefix">2013-04-03 20:58 keltezéssel, Gavin Flower írta:<br /></div><blockquote cite="mid:515C7BC1.5050507@archidevsys.co.nz"type="cite"><div class="moz-cite-prefix">On 04/04/13 05:36, David E. Wheelerwrote:<br /></div><blockquote cite="mid:EE62F5DD-16EB-46C7-AD72-9743619E7650@kineticode.com" type="cite"><pre wrap="">OnApr 3, 2013, at 9:30 AM, Tom Lane <a class="moz-txt-link-rfc2396E" href="mailto:tgl@sss.pgh.pa.us" moz-do-not-send="true"><tgl@sss.pgh.pa.us></a>wrote: </pre><blockquote type="cite"><pre wrap="">Fortran ... Basic ... actually I'd have thought that zero was a minority position. Fashions change I guess. </pre></blockquote><pre wrap="">I say we turn the default lower bound up to 11. David </pre></blockquote><font size="-1">In keeping with the level of irrationality in this thread, ma<font size="-1">ybe we shouldset it to an irrational number like the square root of 2, or transcend our selves and make in </font></font>a transcendentalnumber like pi! :-)<br /><br /> I suppose using the square root of minus one would be consider too imaginative??? :-)<br /></blockquote><br /> Nah, that would make arrays have 2 dimensions as a minimum... :-)<br /><br /><blockquotecite="mid:515C7BC1.5050507@archidevsys.co.nz" type="cite"><br /><br /> Cheers,<br /> Gavin<br /></blockquote><br/><br /><pre class="moz-signature" cols="90">-- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: <a class="moz-txt-link-freetext" href="http://www.postgresql-support.de">http://www.postgresql-support.de</a> <aclass="moz-txt-link-freetext" href="http://www.postgresql.at/">http://www.postgresql.at/</a> </pre>
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Merlin Moncure
Date:
On Wed, Apr 3, 2013 at 11:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > In any case, the whole exercise is pointless if we don't change the > visible behavior of array_dims et al. So I think the idea that this > would be without visible consequence is silly. What's up for argument > is just how much incompatibility is acceptable. The only reasonable answer for this (a provably used, non-security, non-standards violating, non-gross functionality breakage case) is *zero*. Our historically cavalier attitude towards compatibility breakage has been an immense disservice to our users and encourages very bad upgrade habits and is, IMNSHO, embarrassing. Changing the way array_dims works for a minor functionality enhancement is gratuitous and should be done, if at all, via a loudly advertised deprecation/replacement cycle with a guarding GUC (yes, I hate them too, but not nearly as much as the expense of qualifying vast code bases against random compatibility breakages every release). merlin
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Robert Haas
Date:
On Thu, Apr 4, 2013 at 11:10 AM, Merlin Moncure <mmoncure@gmail.com> wrote: > The only reasonable answer for this (a provably used, non-security, > non-standards violating, non-gross functionality breakage case) is > *zero*. +1. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes: > On Thu, Apr 4, 2013 at 11:10 AM, Merlin Moncure <mmoncure@gmail.com> wrote: >> The only reasonable answer for this (a provably used, non-security, >> non-standards violating, non-gross functionality breakage case) is >> *zero*. > +1. Well, if we're going to take that hard a line on it, then we can't change anything about array data storage or the existing functions' behavior; which leaves us with either doing nothing at all, or inventing new functions that have saner behavior while leaving the old ones in place. regards, tom lane
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Brendan Jurd
Date:
On 5 April 2013 07:43, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Well, if we're going to take that hard a line on it, then we can't > change anything about array data storage or the existing functions' > behavior; which leaves us with either doing nothing at all, or > inventing new functions that have saner behavior while leaving the > old ones in place. And then we are in the awkward position of trying to explain the differences in behaviour between the old and new functions ... presumably with a dash of "for historical reasons" and a sprinkling of "to preserve compatibility" in every other paragraph. The other suggestion that had been tossed around elsewhere upthread was inventing a new type that serves the demand for a straightforward mutable list, which has exactly one dimension, and which may be sensibly empty. Those few who are interested in dimensions >= 2 could keep on using "arrays", with all their backwards-compatible silliness intact, and everybody else could migrate to "lists" at their leisure. I don't hate the latter idea from a user perspective, but from a developer perspective I suspect there are valid objections to be made. Cheers, BJ
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Tom Lane
Date:
Brendan Jurd <direvus@gmail.com> writes: > The other suggestion that had been tossed around elsewhere upthread > was inventing a new type that serves the demand for a straightforward > mutable list, which has exactly one dimension, and which may be > sensibly empty. Those few who are interested in dimensions >= 2 could > keep on using "arrays", with all their backwards-compatible silliness > intact, and everybody else could migrate to "lists" at their leisure. > I don't hate the latter idea from a user perspective, but from a > developer perspective I suspect there are valid objections to be made. The real problem with that is that the existing arrays have glommed onto the syntax that is both most natural and SQL-spec-required. I don't think there is a lot of room to shove in a different kind of critter there. (There's been a remarkable lack of attention to the question of spec compliance in this thread, btw. Surely the standard has something to say on the matter of zero-length arrays?) regards, tom lane
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Brendan Jurd
Date:
On 5 April 2013 13:04, Tom Lane <tgl@sss.pgh.pa.us> wrote: > (There's been a remarkable lack of attention to the question > of spec compliance in this thread, btw. Surely the standard has > something to say on the matter of zero-length arrays?) From 4.10 in my draft copy of "Foundation", arrays are one of two "collection" types (the other being multisets), and: "A collection is a composite value comprising zero or more elements, each a value of some data type DT" "The number of elements in C is the cardinality of C" "An array is a collection A in which each element is associated with exactly one ordinal position in A. If n is the cardinality of A, then the ordinal position p of an element is an integer in the range 1 (one) ≤ p ≤ n." The language specifically allows for zero elements, and does not contemplate multiple dimensions. The specification for the array constructor syntax (6.36) and array element reference by subscript (6.23) also make it fairly clear that only 1-D arrays were being considered. I'd say we've already gone way off-menu by having multidims. A more compliant approach would have been to implement arrays as 1-D only, and then maybe have a separate thing ("matrices"?) for multidims. While I was in there I noticed CARDINALITY, which would be pretty easy to add and would at least provide a more productive way to get the "real" length of an array without disrupting existing functionality: "<cardinality expression> ::= CARDINALITY <left paren> <collection value expression> <right paren>" Cheers, BJ
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Tom Lane
Date:
Brendan Jurd <direvus@gmail.com> writes: > On 5 April 2013 13:04, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> (There's been a remarkable lack of attention to the question >> of spec compliance in this thread, btw. Surely the standard has >> something to say on the matter of zero-length arrays?) > The language specifically allows for zero elements, and does not > contemplate multiple dimensions. The specification for the array > constructor syntax (6.36) and array element reference by subscript > (6.23) also make it fairly clear that only 1-D arrays were being > considered. > I'd say we've already gone way off-menu by having multidims. Yeah, we knew that. I don't have a problem with seeing multidim arrays as an extension to the standard though. The point is that the behavior for the 1-D, default-lower-bound case ought to match the standard. > While I was in there I noticed CARDINALITY, which would be pretty easy > to add and would at least provide a more productive way to get the > "real" length of an array without disrupting existing functionality: Yeah, that would at least fix the null-result-for-empty-array problem for that particular functionality. Still, this is ammunition for the position that null results for empty arrays are just broken. BTW ... if you check the archives you will find that we had cardinality() for a short while, and removed it before 8.4 release, because we couldn't agree on what it ought to return when given a multi-dimensional array. I'm afraid that issue is still unresolved. regards, tom lane
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Brendan Jurd
Date:
On 5 April 2013 15:05, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Brendan Jurd <direvus@gmail.com> writes: >> While I was in there I noticed CARDINALITY, which would be pretty easy >> to add and would at least provide a more productive way to get the >> "real" length of an array without disrupting existing functionality: > > Yeah, that would at least fix the null-result-for-empty-array problem > for that particular functionality. Still, this is ammunition for the > position that null results for empty arrays are just broken. > > BTW ... if you check the archives you will find that we had > cardinality() for a short while, and removed it before 8.4 release, > because we couldn't agree on what it ought to return when given a > multi-dimensional array. I'm afraid that issue is still unresolved. Well for what it's worth I would expect cardinality() to return the total number of elements in the array (per ArrayGetNItems). It's consistent with the spec's identification of an array as a "collection". You can chunk the elements into dimensions however you want, but it's still a collection of elements, and the cardinality is still the number of elements. The "nesting" interpretation doesn't accord with our internal representation, nor with our requirement that multidim arrays be regular, nor with the fact that we can't put an array of texts inside an array of ints. Our array input syntaxes for multidim arrays look nest-ish but what they produce is not nested. Cheers, BJ
Brendan Jurd <direvus@gmail.com> writes: > Well for what it's worth I would expect cardinality() to return the > total number of elements in the array (per ArrayGetNItems). It's > consistent with the spec's identification of an array as a > "collection". You can chunk the elements into dimensions however you > want, but it's still a collection of elements, and the cardinality is > still the number of elements. +1 Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Kevin Grittner
Date:
Brendan Jurd <direvus@gmail.com> wrote: > The language specifically allows for zero elements, and does not > contemplate multiple dimensions. I don't remember anything in the spec which would prohibit the data type of an array element from itself being an array, however. -- Kevin Grittner EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Brendan Jurd
Date:
On 6 April 2013 01:59, Kevin Grittner <kgrittn@ymail.com> wrote: > Brendan Jurd <direvus@gmail.com> wrote: > >> The language specifically allows for zero elements, and does not >> contemplate multiple dimensions. > > I don't remember anything in the spec which would prohibit the data > type of an array element from itself being an array, however. Indeed it does not prohibit nesting arrays inside other arrays, but the multidim arrays that Postgres allows you to create are not the same thing as nested arrays. I believe that a purely to-spec implementation would allow you to make an array-of-int-arrays, but since each element is its own separate collection there would be no requirement that they have the same cardinality as each other. For example, ARRAY[[1], [2,3], [4,5,6]] is a valid collection per the spec, but Postgres won't let you create this, because Postgres is trying to create a 2-D matrix of integers, rather than a collection of collections of integers. The inability to extend multidim arrays in Postgres is another manifestation of this matrix-oriented design. Extending nested collections is a no-brainer. Extending matrices while ensuring they remain perfectly regular ... not so much. Cheers, BJ
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Kevin Grittner
Date:
Brendan Jurd <direvus@gmail.com> wrote: > On 6 April 2013 01:59, Kevin Grittner <kgrittn@ymail.com> wrote: >> Brendan Jurd <direvus@gmail.com> wrote: >> >>> The language specifically allows for zero elements, and does not >>> contemplate multiple dimensions. >> >> I don't remember anything in the spec which would prohibit the data >> type of an array element from itself being an array, however. > > Indeed it does not prohibit nesting arrays inside other arrays, but > the multidim arrays that Postgres allows you to create are not the > same thing as nested arrays. > > I believe that a purely to-spec implementation would allow you to make > an array-of-int-arrays, but since each element is its own separate > collection there would be no requirement that they have the same > cardinality as each other. > > For example, ARRAY[[1], [2,3], [4,5,6]] is a valid collection per the > spec, but Postgres won't let you create this, because Postgres is > trying to create a 2-D matrix of integers, rather than a collection of > collections of integers. > > The inability to extend multidim arrays in Postgres is another > manifestation of this matrix-oriented design. Extending nested > collections is a no-brainer. Extending matrices while ensuring they > remain perfectly regular ... not so much. Your interpretation matches mine all around. It is unfortunate that we have hijacked the standard's syntax for arrays to add a matrix feature. It seems to leave us without any way forward on these issues that I like very much. -- Kevin Grittner EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Brendan Jurd
Date:
On 7 April 2013 01:43, Kevin Grittner <kgrittn@ymail.com> wrote: > Brendan Jurd <direvus@gmail.com> wrote: >> Indeed it does not prohibit nesting arrays inside other arrays, but >> the multidim arrays that Postgres allows you to create are not the >> same thing as nested arrays. >> > Your interpretation matches mine all around. It is unfortunate > that we have hijacked the standard's syntax for arrays to add a > matrix feature. It really is unfortunate. I wonder if it was done in an attempt to mimic Oracle behaviour. > It seems to leave us without any way forward on > these issues that I like very much. On the specific issue of CARDINALITY, I guess we need to decide whether we are going to pretend that our array/matrix thing is actually nested. I first argued that we should not. But it occurred to me that if we do pretend, it would at least leave the door ajar if we want to do something to make our arrays more nest-like in future, without disrupting the behaviour of CARDINALITY. It is unlikely that we ever would make such a change, but given the intensity of the opposition to any kind of SQL-level behavioural changes we've had on this thread, I don't want to create any more barriers to future efforts to comport with the spec. So how about: * we add CARDINALITY, make it work like array_length(a, 1) except that it returns zero for empty arrays, and * we add array_num_items, which exposes the internal ArrayGetNItems, and returns zero for empty arrays. As in: CARDINALITY(ARRAY[[1,2], [3,4], [5,6]]) => 3 array_num_items(ARRAY[[1,2], [3,4], [5,6]]) => 6 Cheers, BJ
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Tom Lane
Date:
Brendan Jurd <direvus@gmail.com> writes: > On 7 April 2013 01:43, Kevin Grittner <kgrittn@ymail.com> wrote: >> Your interpretation matches mine all around. It is unfortunate >> that we have hijacked the standard's syntax for arrays to add a >> matrix feature. > It really is unfortunate. I wonder if it was done in an attempt to > mimic Oracle behaviour. Hardly likely. That code goes back to Berkeley days (PostQUEL) --- there is clear ancestry from the array code in Postgres v4r2 released June 1994. It's more or less a coincidence that it matches the SQL spec at all, and I'd be astonished if it matched Oracle particularly closely. > On the specific issue of CARDINALITY, I guess we need to decide > whether we are going to pretend that our array/matrix thing is > actually nested. I first argued that we should not. But it occurred > to me that if we do pretend, it would at least leave the door ajar if > we want to do something to make our arrays more nest-like in future, > without disrupting the behaviour of CARDINALITY. This seems to be exactly the same uncertainty that we couldn't resolve back in the 8.4 devel cycle, for exactly the same reasons. I don't see that the discussion has moved forward any :-( regards, tom lane
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Brendan Jurd
Date:
On 8 April 2013 16:09, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Brendan Jurd <direvus@gmail.com> writes: >> On the specific issue of CARDINALITY, I guess we need to decide >> whether we are going to pretend that our array/matrix thing is >> actually nested. I first argued that we should not. But it occurred >> to me that if we do pretend, it would at least leave the door ajar if >> we want to do something to make our arrays more nest-like in future, >> without disrupting the behaviour of CARDINALITY. > > This seems to be exactly the same uncertainty that we couldn't resolve > back in the 8.4 devel cycle, for exactly the same reasons. I don't see > that the discussion has moved forward any :-( > I had a poke around in the archives, and it seems to me that the major argument that was advanced in favour of making cardinality() return the total number of items was ... we don't have anything that does that yet. That's why I'm proposing we add array_num_items as well -- I do think there should be a function for this, I just don't think cardinality fits the bill. Cheers, BJ
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Greg Stark
Date:
On Thu, Apr 4, 2013 at 4:10 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
The only reasonable answer for this (a provably used, non-security,
non-standards violating, non-gross functionality breakage case) is
*zero*. Our historically cavalier attitude towards compatibility
breakage has been an immense disservice to our users and encourages
very bad upgrade habits and is, IMNSHO, embarrassing.
The flip side of this is that the ability to make improvements freely is one of the biggest strengths of free software over commercial software. Oracle and Microsoft invest *tons* of money in maintaining huge libraries of backward compatibility code and drastically limits their ability to keep making improvements.
greg
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Josh Berkus
Date:
All, As much as I have a keen interest in this feature, it isn't (AFAIK) being considered for 9.3. Given that it's generated a fair amount of controversy, could we table it until 9.3 beta? There's still plenty of unresolved 9.3 patches in the queue. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Bruce Momjian
Date:
On Mon, Apr 8, 2013 at 04:24:31PM -0700, Josh Berkus wrote: > All, > > As much as I have a keen interest in this feature, it isn't (AFAIK) > being considered for 9.3. Given that it's generated a fair amount of > controversy, could we table it until 9.3 beta? There's still plenty of > unresolved 9.3 patches in the queue. Can someone add a TODO item or a link to the thread? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Brendan Jurd
Date:
On 9 April 2013 09:24, Josh Berkus <josh@agliodbs.com> wrote: > As much as I have a keen interest in this feature, it isn't (AFAIK) > being considered for 9.3. Given that it's generated a fair amount of > controversy, could we table it until 9.3 beta? There's still plenty of > unresolved 9.3 patches in the queue. No problem. I certainly wasn't expecting it to run for 90 messages when I started out. I'll pipe down for now and resume after the beta. Cheers, BJ
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Josh Berkus
Date:
On 04/08/2013 07:16 PM, Brendan Jurd wrote: > On 9 April 2013 09:24, Josh Berkus <josh@agliodbs.com> wrote: >> As much as I have a keen interest in this feature, it isn't (AFAIK) >> being considered for 9.3. Given that it's generated a fair amount of >> controversy, could we table it until 9.3 beta? There's still plenty of >> unresolved 9.3 patches in the queue. > > No problem. I certainly wasn't expecting it to run for 90 messages > when I started out. I'll pipe down for now and resume after the beta. What's the status on this patch and current approach to ZDA? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Brendan Jurd
Date:
On 12 June 2013 04:43, Josh Berkus <josh@agliodbs.com> wrote: > What's the status on this patch and current approach to ZDA? Alright, it might be a good idea to have a quick recap. Last time, on Arrays Of Our Lives ... So I proposed and posted a patch aimed at deprecating zero-D arrays, and replacing them with 1-D empty arrays. That kicked off a long discussion, with some folks (Robert, Merlin, Florian et al) vigorously opposing any change in the behaviour of the array_(length|ndims|lower|upper) functions. To those folks, the behaviour would have to be a lot worse than it currently is to justify breaking compatibility with existing applications. The idea of using a GUC to smooth over the compatibility break was suggested, and firmly rejected. The idea of creating an entirely new type with nicer behaviours came up, but that wouldn't really fly because arrays have already hogged all the best syntax. Since compatibility breakage is so contentious, I suggested that we forget about changing the array representation and just add new functions with more sensible behaviours: * cardinality(anyarray) to return the length of the first dimension, zero if empty, and * array_num_items(anyarray) to return the total number of element positions per ArrayGetNItems, zero if empty. There have been attempts to add a cardinality function in the past, as it is required by the SQL spec, but these attempts have stalled when trying to decide how it should handle multidim arrays. Having it return the length of the first dimension is the more spec-compatible way to go, but some folks argued that it should work as ArrayGetNItems, because we don't already have a function for that at the SQL level. Therefore I propose we add cardinality() per the spec, and another function to expose ArrayGetNItems. And that's about where we got to, when the whole discussion was put on a time-out to make room for the beta. I am withdrawing the original zero-D patch in favour of the proposed new functions. If you have an opinion about that, please do chime in.Depending on how that goes I may post a patch implementingmy new proposal in the next few days. Cheers, BJ
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
"David E. Wheeler"
Date:
On Jun 11, 2013, at 3:09 PM, Brendan Jurd <direvus@gmail.com> wrote: > There have been attempts to add a cardinality function in the past, as > it is required by the SQL spec, but these attempts have stalled when > trying to decide how it should handle multidim arrays. Having it > return the length of the first dimension is the more spec-compatible > way to go, but some folks argued that it should work as > ArrayGetNItems, because we don't already have a function for that at > the SQL level. Therefore I propose we add cardinality() per the spec, > and another function to expose ArrayGetNItems. > > And that's about where we got to, when the whole discussion was put on > a time-out to make room for the beta. > > I am withdrawing the original zero-D patch in favour of the proposed > new functions. If you have an opinion about that, please do chime in. > Depending on how that goes I may post a patch implementing my new > proposal in the next few days. +1 to this proposal. Modulo function names, perhaps. I don’t much care what they're called, as long as the work as you describehere. Best, David Array Complainer
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Dean Rasheed
Date:
On 12 June 2013 04:50, David E. Wheeler <david@justatheory.com> wrote: > On Jun 11, 2013, at 3:09 PM, Brendan Jurd <direvus@gmail.com> wrote: > >> There have been attempts to add a cardinality function in the past, as >> it is required by the SQL spec, but these attempts have stalled when >> trying to decide how it should handle multidim arrays. Having it >> return the length of the first dimension is the more spec-compatible >> way to go, but some folks argued that it should work as >> ArrayGetNItems, because we don't already have a function for that at >> the SQL level. Therefore I propose we add cardinality() per the spec, >> and another function to expose ArrayGetNItems. >> >> And that's about where we got to, when the whole discussion was put on >> a time-out to make room for the beta. >> >> I am withdrawing the original zero-D patch in favour of the proposed >> new functions. If you have an opinion about that, please do chime in. >> Depending on how that goes I may post a patch implementing my new >> proposal in the next few days. > > +1 to this proposal. Modulo function names, perhaps. I don’t much care what they're called, as long as the work as youdescribe here. > +1 for having a function to return the total number of elements in an array, because that's something that's currently missing from SQL. However, I think that CARDINALITY() should be that function. I'm not convinced that having CARDINALITY() return the length of the first dimension is more spec-compatible, since our multi-dimensional arrays aren't nested arrays, and it seems unlikely that they ever will be. I'd argue that it's at least equally spec-compatible to have CARDINALITY() return the total number of elements in the array, if you think of a multi-dimensional array as a collection of elements arranged in a regular pattern. Also, the spec describes CARDINALITY() and UNNEST() using the same language, and I think it's implicit in a couple of places that CARDINALITY() should match the number of rows returned by UNNEST(), which we've already implemented as fully unnesting every element. We're about to add ORDINALITY to UNNEST(), and to me it would be very odd to have the resulting maximum ordinality exceed the array's cardinality. Regards, Dean
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Brendan Jurd
Date:
On 12 June 2013 18:22, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > +1 for having a function to return the total number of elements in an > array, because that's something that's currently missing from SQL. > > However, I think that CARDINALITY() should be that function. > > I'm not convinced that having CARDINALITY() return the length of the > first dimension is more spec-compatible, since our multi-dimensional > arrays aren't nested arrays, and it seems unlikely that they ever will > be. I'd argue that it's at least equally spec-compatible to have > CARDINALITY() return the total number of elements in the array, if you > think of a multi-dimensional array as a collection of elements > arranged in a regular pattern. It's true that our multidims aren't nested, but they are the nearest thing we have. If we want to keep the door open for future attempts to nudge multidim arrays into closer approximation of nested arrays, it would be better to have the nested interpretation of CARDINALITY. Given what we've just gone through with array_length, it seems that once we select a behaviour for CARDINALITY, we will be stuck with it permanently. The problem with thinking of our multidim arrays as just a weirdly crumpled arrangement of a single collection, is that we've already abused the nesting syntax for declaring them. > Also, the spec describes CARDINALITY() and UNNEST() using the same > language, and I think it's implicit in a couple of places that > CARDINALITY() should match the number of rows returned by UNNEST(), > which we've already implemented as fully unnesting every element. > > We're about to add ORDINALITY to UNNEST(), and to me it would be very > odd to have the resulting maximum ordinality exceed the array's > cardinality. Yeah, that makes sense. Well the good news is that either way, CARDINALITY will do what people want in the most common case where the array is one-dimensional. Multidim arrays are why we can't have nice things. Cheers, BJ
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Robert Haas
Date:
On Wed, Jun 12, 2013 at 5:05 AM, Brendan Jurd <direvus@gmail.com> wrote: > On 12 June 2013 18:22, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: >> +1 for having a function to return the total number of elements in an >> array, because that's something that's currently missing from SQL. >> >> However, I think that CARDINALITY() should be that function. >> >> I'm not convinced that having CARDINALITY() return the length of the >> first dimension is more spec-compatible, since our multi-dimensional >> arrays aren't nested arrays, and it seems unlikely that they ever will >> be. I'd argue that it's at least equally spec-compatible to have >> CARDINALITY() return the total number of elements in the array, if you >> think of a multi-dimensional array as a collection of elements >> arranged in a regular pattern. > > It's true that our multidims aren't nested, but they are the nearest > thing we have. If we want to keep the door open for future attempts > to nudge multidim arrays into closer approximation of nested arrays, > it would be better to have the nested interpretation of CARDINALITY. I think there's just about zero chance of something like that ever happening. The problem is that our type system just can't support it.A function or operator that takes an array needs todeclare whether it's going to return an array or whether it's going to return the base type. It can't decide to return one or the other at run-time depending on the dimensionality of the array. For this to really work, we'd need the number of dimensions to be baked into the array type. The obvious implementation would be to have N array types per base type rather than 1, each with a different number of dimensions. Then a subscripting function which took a 1-dimensional array could return anyelement, and the similarly named function which took a 2-dimensional array could return a 1-dimensional array. I believe the reason it wasn't done this way initially was because of pg_type bloat; having 6 extra type definitions for every type we support is unappealing. We could force them to be explicitly declared as we do for range types. Or we could rewrite a whole lotta code to understand a "type" as something more complex than "an OID from pg_type", so that we don't need pre-defined entries in pg_type for array types in the first place. But none of these things are nudges. Making any real improvement in this area is going to take major surgery, not a nudge. > Multidim arrays are why we can't have nice things. Yeah, I think that was not our best moment. :-( -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Josh Berkus
Date:
>> Multidim arrays are why we can't have nice things. > > Yeah, I think that was not our best moment. :-( They were one of those hacks which looked really clever at the time, but proved not to be so. Unfortunately, they *are* useful, and are being used; I use MD arrays all the time to push data in and out of PL/R., and now that we have a JSON type I'm using them to generate arrays-of-arrays using JSON conversion functions. I'm sure many others are doing the same, so there's no killing the feature. Actually, if you think about it, we've never had "arrays" in PostgreSQL ... we have always had matrixes. If you think about things that way, most of the current functionality makes sense. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes: > Actually, if you think about it, we've never had "arrays" in PostgreSQL > ... we have always had matrixes. If you think about things that way, > most of the current functionality makes sense. Arguably, the only thing wrong with multidim arrays is that they're not what the SQL standard asks for. However, the original point in this thread was that we have some very bizarre corner-case behavior for empty arrays. I'm going to be disappointed if all we can get out of this is a cardinality() function, and nothing is done about the empty-array semantics. regards, tom lane
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Josh Berkus
Date:
On 06/12/2013 11:01 AM, Tom Lane wrote: > Arguably, the only thing wrong with multidim arrays is that they're not > what the SQL standard asks for. However, the original point in this > thread was that we have some very bizarre corner-case behavior for empty > arrays. I'm going to be disappointed if all we can get out of this is > a cardinality() function, and nothing is done about the empty-array > semantics. Well, we can't change the zero-dim behavior without breaking backwards compatibility. And enough people piled on to say NO to that, that it went by the wayside. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes: > On 06/12/2013 11:01 AM, Tom Lane wrote: >> I'm going to be disappointed if all we can get out of this is >> a cardinality() function, and nothing is done about the empty-array >> semantics. > Well, we can't change the zero-dim behavior without breaking backwards > compatibility. And enough people piled on to say NO to that, that it > went by the wayside. Meh. Robert was pretty vocal about it, but it wasn't clear to me that his was the majority opinion, and in any case there wasn't much consideration given to compromises falling somewhere between "no changes" and the rather drastic solution Brendan proposed. For instance, it's really hard to believe that this is a good thing: regression=# select array_dims('{}'::int[]) is null;?column? ----------t (1 row) Whatever you think the dimensions of that are, surely they're not unknown. regards, tom lane
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Merlin Moncure
Date:
On Wed, Jun 12, 2013 at 1:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Josh Berkus <josh@agliodbs.com> writes: >> On 06/12/2013 11:01 AM, Tom Lane wrote: >>> I'm going to be disappointed if all we can get out of this is >>> a cardinality() function, and nothing is done about the empty-array >>> semantics. > >> Well, we can't change the zero-dim behavior without breaking backwards >> compatibility. And enough people piled on to say NO to that, that it >> went by the wayside. > > Meh. Robert was pretty vocal about it, but it wasn't clear to me that > his was the majority opinion, and in any case there wasn't much > consideration given to compromises falling somewhere between "no > changes" and the rather drastic solution Brendan proposed. For > instance, it's really hard to believe that this is a good thing: > > regression=# select array_dims('{}'::int[]) is null; > ?column? > ---------- > t > (1 row) > > Whatever you think the dimensions of that are, surely they're not > unknown. But, couldn't that be solved by deprecating that function and providing a more sensible alternatively named version? merlin
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Darren Duncan
Date:
On 2013.06.12 10:11 AM, Josh Berkus wrote: >>> Multidim arrays are why we can't have nice things. >> >> Yeah, I think that was not our best moment. :-( > > Actually, if you think about it, we've never had "arrays" in PostgreSQL > ... we have always had matrixes. If you think about things that way, > most of the current functionality makes sense. If the type system were changed so that arrays were were always just one-dimensional, you can define your matrix simply as a binary relation type whose primary key attribute has the type of a fixed-length array of integers, where the number of elements in the array is the number of dimensions in the matrix, and the array elements themselves defined the coordinates in the matrix. This design confers a number of benefits. Also, the case of the zero-dimension matrix needs no special treatment; the key array has zero elements. Would that not work? -- Darren Duncan
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Brendan Jurd
Date:
On 13 June 2013 04:26, Merlin Moncure <mmoncure@gmail.com> wrote: > On Wed, Jun 12, 2013 at 1:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Josh Berkus <josh@agliodbs.com> writes: >>> On 06/12/2013 11:01 AM, Tom Lane wrote: >>>> I'm going to be disappointed if all we can get out of this is >>>> a cardinality() function, and nothing is done about the empty-array >>>> semantics. I would be disappointed too, but on the other hand, CARDINALITY is required by the spec and anything would be better than nothing. >> Meh. Robert was pretty vocal about it, but it wasn't clear to me that >> his was the majority opinion, and in any case there wasn't much >> consideration given to compromises falling somewhere between "no >> changes" and the rather drastic solution Brendan proposed. I'm all for looking into possible compromises, and will happily take any improvements to this mess I think I can get past the compatibility maximalist caucus. >> regression=# select array_dims('{}'::int[]) is null; >> ?column? >> ---------- >> t >> (1 row) >> >> Whatever you think the dimensions of that are, surely they're not >> unknown. I don't think anyone has actually tried to defend the behaviour of the array functions w.r.t. empty arrays. Even the opponents of the original proposal agreed that the behaviour was silly, they just didn't want to fix it, on account of the upgrade burden. > But, couldn't that be solved by deprecating that function and > providing a more sensible alternatively named version? And what would you name that function? array_dims2? I can't think of a name that makes the difference in behaviour apparent. Can you imagine the documentation for that? array_dims - Returns the dimensions of the array, unless it is empty in which case NULL. array_proper_dims - Returns the dimensions of the array. array_ndims - Returns the number of dimension, unless it is empty in which case NULL. array_proper_ndims - Returns the number of dimensions. ... and so on for _length, _upper and _lower. Cheers, BJ
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Merlin Moncure
Date:
On Wed, Jun 12, 2013 at 10:00 PM, Brendan Jurd <direvus@gmail.com> wrote: > On 13 June 2013 04:26, Merlin Moncure <mmoncure@gmail.com> wrote: >> On Wed, Jun 12, 2013 at 1:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Josh Berkus <josh@agliodbs.com> writes: >>>> On 06/12/2013 11:01 AM, Tom Lane wrote: >>>>> I'm going to be disappointed if all we can get out of this is >>>>> a cardinality() function, and nothing is done about the empty-array >>>>> semantics. > > I would be disappointed too, but on the other hand, CARDINALITY is > required by the spec and anything would be better than nothing. certainly. >> But, couldn't that be solved by deprecating that function and >> providing a more sensible alternatively named version? > > And what would you name that function? array_dims2? I can't think of > a name that makes the difference in behaviour apparent. Can you > imagine the documentation for that? I don't know the answer to that, but I think it's hard to argue that deprecating and documenting a few functions is a heavier burden on your users than having to sift through older arcane code before upgrading to the latest version of the database. We're not the only ones stuck with lousy old functions (C finally ditched gets() in the 2011 standard). I also happen to think the current array_api function names are not particularly great (especially array_upper/array_lower) so I won't shed too many tears. merlin
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
"David E. Wheeler"
Date:
On Jun 12, 2013, at 8:00 PM, Brendan Jurd <direvus@gmail.com> wrote: > array_dims - Returns the dimensions of the array, unless it is empty > in which case NULL. > array_proper_dims - Returns the dimensions of the array. > array_ndims - Returns the number of dimension, unless it is empty in > which case NULL. > array_proper_ndims - Returns the number of dimensions. > > ... and so on for _length, _upper and _lower. Similar things should have dissimilar names. I propose: <bikeshedding> Old | New --------------+--------------array_dims | array_descarray_ndims | array_deptharray_length | array_sizearray_lower | array_startarray_upper | array_finish The last two are meh, but it’s a place to start… </bikeshedding> David
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Brendan Jurd
Date:
On 14 June 2013 03:53, David E. Wheeler <david@justatheory.com> wrote: > Similar things should have dissimilar names. I propose: > > <bikeshedding> > > Old | New > --------------+-------------- > array_dims | array_desc array_bounds? > array_ndims | array_depth > array_length | array_size > array_lower | array_start > array_upper | array_finish > > The last two are meh, but it’s a place to start… I think that even with the most dissimilar names we can come up with, this is going to confuse people. But it is still better than doing nothing. I wonder whether, if we go in this direction, we could still use some of the work I did on deprecating zero-D arrays. Let's say the old functions keep doing what they do now, and we teach them to treat all empty arrays the same way they currently treat zero-D arrays (return NULL). The new functions treat zero-D arrays as though they were 1-D empty with default bounds, and we add CARDINALITY per ArrayGetNItems. This way, applications would not be broken by upgrading, and we'd be giving people a way to opt-in to a better API. Cheers, BJ
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Bruce Momjian
Date:
On Thu, Jun 13, 2013 at 11:57:27AM -0500, Merlin Moncure wrote: > >> But, couldn't that be solved by deprecating that function and > >> providing a more sensible alternatively named version? > > > > And what would you name that function? array_dims2? I can't think of > > a name that makes the difference in behaviour apparent. Can you > > imagine the documentation for that? > > I don't know the answer to that, but I think it's hard to argue that > deprecating and documenting a few functions is a heavier burden on > your users than having to sift through older arcane code before > upgrading to the latest version of the database. We're not the only > ones stuck with lousy old functions (C finally ditched gets() in the > 2011 standard). I also happen to think the current array_api function > names are not particularly great (especially array_upper/array_lower) > so I won't shed too many tears. Sorry to be late on this, but are you saying people have code that is testing: select array_dims('{}'::int[]) for a NULL return, and they would need to change that to test for zero? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Merlin Moncure
Date:
On Thu, Jun 20, 2013 at 2:58 PM, Bruce Momjian <bruce@momjian.us> wrote: > On Thu, Jun 13, 2013 at 11:57:27AM -0500, Merlin Moncure wrote: >> >> But, couldn't that be solved by deprecating that function and >> >> providing a more sensible alternatively named version? >> > >> > And what would you name that function? array_dims2? I can't think of >> > a name that makes the difference in behaviour apparent. Can you >> > imagine the documentation for that? >> >> I don't know the answer to that, but I think it's hard to argue that >> deprecating and documenting a few functions is a heavier burden on >> your users than having to sift through older arcane code before >> upgrading to the latest version of the database. We're not the only >> ones stuck with lousy old functions (C finally ditched gets() in the >> 2011 standard). I also happen to think the current array_api function >> names are not particularly great (especially array_upper/array_lower) >> so I won't shed too many tears. > > Sorry to be late on this, but are you saying people have code that is > testing: > > select array_dims('{}'::int[]) > > for a NULL return, and they would need to change that to test for zero? Kinda -- what I'm saying is you just don't go around changing function behaviors to make them 'better' unless the affected behavior was specifically reserved as undefined. The fact is nobody knows how many users will be affected and the extent of the ultimate damage (pro tip: it's always more and worse than expected); I'm astonished it's even being considered. merlin
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Bruce Momjian
Date:
On Thu, Jun 20, 2013 at 03:33:24PM -0500, Merlin Moncure wrote: > On Thu, Jun 20, 2013 at 2:58 PM, Bruce Momjian <bruce@momjian.us> wrote: > > On Thu, Jun 13, 2013 at 11:57:27AM -0500, Merlin Moncure wrote: > >> >> But, couldn't that be solved by deprecating that function and > >> >> providing a more sensible alternatively named version? > >> > > >> > And what would you name that function? array_dims2? I can't think of > >> > a name that makes the difference in behaviour apparent. Can you > >> > imagine the documentation for that? > >> > >> I don't know the answer to that, but I think it's hard to argue that > >> deprecating and documenting a few functions is a heavier burden on > >> your users than having to sift through older arcane code before > >> upgrading to the latest version of the database. We're not the only > >> ones stuck with lousy old functions (C finally ditched gets() in the > >> 2011 standard). I also happen to think the current array_api function > >> names are not particularly great (especially array_upper/array_lower) > >> so I won't shed too many tears. > > > > Sorry to be late on this, but are you saying people have code that is > > testing: > > > > select array_dims('{}'::int[]) > > > > for a NULL return, and they would need to change that to test for zero? > > Kinda -- what I'm saying is you just don't go around changing function > behaviors to make them 'better' unless the affected behavior was > specifically reserved as undefined. The fact is nobody knows how many > users will be affected and the extent of the ultimate damage (pro tip: > it's always more and worse than expected); I'm astonished it's even > being considered. Well, I think the question is how many people have such arrays that will be effected. If we don't do something, we live with this odd behavior forever. We have been willing to make some bold decisions in the past to improve user experience, and it mostly has worked out well. I disagree that it is always worse than expected. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Merlin Moncure
Date:
On Thu, Jun 20, 2013 at 6:40 PM, Bruce Momjian <bruce@momjian.us> wrote: >> Kinda -- what I'm saying is you just don't go around changing function >> behaviors to make them 'better' unless the affected behavior was >> specifically reserved as undefined. The fact is nobody knows how many >> users will be affected and the extent of the ultimate damage (pro tip: >> it's always more and worse than expected); I'm astonished it's even >> being considered. > > Well, I think the question is how many people have such arrays that will > be effected. If we don't do something, we live with this odd behavior > forever. We have been willing to make some bold decisions in the past > to improve user experience, and it mostly has worked out well. I > disagree that it is always worse than expected. Well, you can have the last word (although 'bold' was an interesting word choice, heh) -- I feel guilty enough about beating up Brendan already. I feel this way every time compatibility changes come up, so it's nothing specific to this patch really. merlin
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Bruce Momjian
Date:
On Thu, Jun 20, 2013 at 07:13:48PM -0500, Merlin Moncure wrote: > On Thu, Jun 20, 2013 at 6:40 PM, Bruce Momjian <bruce@momjian.us> wrote: > >> Kinda -- what I'm saying is you just don't go around changing function > >> behaviors to make them 'better' unless the affected behavior was > >> specifically reserved as undefined. The fact is nobody knows how many > >> users will be affected and the extent of the ultimate damage (pro tip: > >> it's always more and worse than expected); I'm astonished it's even > >> being considered. > > > > Well, I think the question is how many people have such arrays that will > > be effected. If we don't do something, we live with this odd behavior > > forever. We have been willing to make some bold decisions in the past > > to improve user experience, and it mostly has worked out well. I > > disagree that it is always worse than expected. > > Well, you can have the last word (although 'bold' was an interesting > word choice, heh) -- I feel guilty enough about beating up Brendan > already. I feel this way every time compatibility changes come up, so > it's nothing specific to this patch really. Well, sometimes we underestimate the impact of changes, sometimes we overestimate. The big problem is weighing the short-term problems of change but not the long-term benefit of a change. This array problem goes back to at least 2008: http://www.postgresql.org/message-id/28026.1224611437@sss.pgh.pa.us so we have at least five years of confusion by not changing it then. I am not saying we need to change it, but do think we need to weigh both issues. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Josh Berkus
Date:
Bruce, > Well, sometimes we underestimate the impact of changes, sometimes we > overestimate. The big problem is weighing the short-term problems of > change but not the long-term benefit of a change. This array problem > goes back to at least 2008: > > http://www.postgresql.org/message-id/28026.1224611437@sss.pgh.pa.us > > so we have at least five years of confusion by not changing it then. I > am not saying we need to change it, but do think we need to weigh both > issues. As much as I hate the current behavior (my first response was "yeah, fix those babies!"), I think we don't have a choice about creating new function names and then waiting three years to deprecate the old ones. We really can't afford to put obstacles in the way of people upgrading, especially over an issue as minor as this one. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
From
Bruce Momjian
Date:
On Thu, Jun 20, 2013 at 06:28:07PM -0700, Josh Berkus wrote: > Bruce, > > > Well, sometimes we underestimate the impact of changes, sometimes we > > overestimate. The big problem is weighing the short-term problems of > > change but not the long-term benefit of a change. This array problem > > goes back to at least 2008: > > > > http://www.postgresql.org/message-id/28026.1224611437@sss.pgh.pa.us > > > > so we have at least five years of confusion by not changing it then. I > > am not saying we need to change it, but do think we need to weigh both > > issues. > > As much as I hate the current behavior (my first response was "yeah, fix > those babies!"), I think we don't have a choice about creating new > function names and then waiting three years to deprecate the old ones. > We really can't afford to put obstacles in the way of people upgrading, > especially over an issue as minor as this one. Perhaps we need to mark the TODO item as "will not fix". -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +