Thread: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)

On 17 March 2013 05:19, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Brendan Jurd <direvus@gmail.com> writes:
>> On 16 March 2013 09:07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> The thing is that that syntax creates an array of zero dimensions,
>>> not one that has 1 dimension and zero elements.
>
>> I'm going to ask the question that immediately comes to mind: Is there
>> anything good at all about being able to define a zero-dimensional
>> array?
>
> Perhaps not.  I think for most uses, a 1-D zero-length array would be
> just as good.  I guess what I'd want to know is whether we also need
> to support higher-dimensional zero-size arrays, and if so, what does
> the I/O syntax for those look like?
>
> Another fly in the ointment is that if we do redefine '{}' as meaning
> something other than a zero-D array, how will we handle existing
> database entries that are zero-D arrays?
>

Hello hackers,

I submit a patch to rectify the weird and confusing quirk of Postgres
to use "zero dimensions" to signify an empty array.

Rationale:
The concept of an array without dimensions is a) incoherent, and b)
leads to astonishing results from our suite of user-facing array
functions.  array_length, array_dims, array_ndims, array_lower and
array_upper all return NULL when presented such an array.

When a user writes ARRAY[]::int[], they expect to get an empty array,
but instead we've been giving them a bizarre semi-functional
proto-array.  Not cool.

Approach:
The patch teaches Postgres to regard zero as an invalid number of
dimensions (which it very much is), and allows instead for fully armed
and operational empty arrays.

The simplest form of empty array is the 1-D empty array (ARRAY[] or
'{}') but the patch also allows for empty arrays over multiple
dimensions, meaning that the final dimension has a length of zero, but
the other dimensions may have any valid length.  For example,
'{{},{},{}}' is a 2-D empty array with dimension lengths {3,0} and
dimension bounds [1:3][1:0].
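
To make that concrete, under the patch (not current Postgres) the
interrogation functions would report the following for that example:

SELECT array_dims('{{},{},{}}'::int[]);       -- [1:3][1:0]
SELECT array_ndims('{{},{},{}}'::int[]);      -- 2
SELECT array_length('{{},{},{}}'::int[], 1);  -- 3
SELECT array_length('{{},{},{}}'::int[], 2);  -- 0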

An empty array dimension may have any valid lower bound, but by
default the lower bound will be 1 and the upper bound will therefore
be 0.

Any zero-D arrays read in from existing database entries will be
output as 1-D empty arrays from array_recv.

There is an existing limitation where you cannot extend a multi-D
array by assigning values to subscripts or slices.  I've made no
attempt to address that limitation.

The patch improves some error reporting, particularly by adding
errdetail messages for ereports of problems parsing a curly-brace
array literal.

There is some miscellaneous code cleanup included; I moved the
hardcoded characters '{', '}', etc. into constants, unwound a
superfluous inner loop from ArrayCount, and corrected some mistakes in
code comments and error texts.  If preferred for reviewing, I can
split those changes (and/or the new errdetails) out into a separate
patch.

Incompatibility:
This patch introduces an incompatible change in the behaviour of the
aforementioned array functions -- instead of returning NULL for empty
arrays they return meaningful values.  Applications relying on the old
behaviour to test for emptiness may be disrupted.  One can
future-proof against this by changing e.g.

IF array_length(arr, 1) IS NULL  ...

to

IF coalesce(array_length(arr, 1), 0) = 0  ...

There is also a change in the way array subscript assignment works.
Previously it wasn't possible to make a distinction between assigning
to an empty array and assigning to a NULL, so in either case an
expression like "arr[4] := 1" would create "[4:4]={1}".  Now there is
a distinction.  If you assign to an empty array you will get "{NULL,
NULL, NULL, 1}", whereas if you assign to a NULL you'll get
"[4:4]={1}".
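
A sketch of the two cases in PL/pgSQL, as they would behave with the
patch applied (hypothetical results, not current behaviour):

DO $$
DECLARE
    a int[] := '{}';   -- empty array
    b int[];           -- NULL
BEGIN
    a[4] := 1;  -- extends the empty array: a becomes {NULL,NULL,NULL,1}
    b[4] := 1;  -- creates a new array:     b becomes [4:4]={1}
END $$;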

Regression tests:
The regression tests were updated to reflect behavioural changes.

Documentation:
A minor update to the prose in chapter 8.15 is included in the patch.

Issues:
Fixed-length arrays (like oidvector) are zero-based, which means that
they are rendered into string form with their dimension info shown.
So an empty oidvector now renders as "[0:-1]={}", which is correct but
ugly.  I'm not delighted with this side-effect but I'm not sure what
can be done about it.  I'm open to ideas.

Diffstat:
 doc/src/sgml/array.sgml                         |   4 +-
 src/backend/executor/execQual.c                 |  77 +++++-
 src/backend/utils/adt/array_userfuncs.c         |  23 +-
 src/backend/utils/adt/arrayfuncs.c              | 671 +++++++++++++++++++++++++----------------
 src/backend/utils/adt/arrayutils.c              |   4 +
 src/backend/utils/adt/xml.c                     |  19 +-
 src/include/c.h                                 |   1 +
 src/include/utils/array.h                       |   4 +
 src/pl/plpython/plpy_typeio.c                   |   3 -
 src/test/isolation/expected/timeouts.out        |  16 +-
 src/test/isolation/specs/timeouts.spec          |   8 +-
 src/test/regress/expected/arrays.out            |  55 ++--
 src/test/regress/expected/create_function_3.out |   2 +-
 src/test/regress/expected/polymorphism.out      |   2 +-
 src/test/regress/sql/arrays.sql                 |   6 +-
 15 files changed, 519 insertions(+), 376 deletions(-)


I'll add this to the Commit Fest app.  It is intended to resolve TODO
item "Improve handling of empty arrays".

Cheers,
BJ

On Mar 20, 2013, at 4:45 PM, Brendan Jurd <direvus@gmail.com> wrote:

> I submit a patch to rectify the weird and confusing quirk of Postgres
> to use "zero dimensions" to signify an empty array.

Epic. Thank you. I’m very glad now that I complained about this (again)!

Best,

David




On 03/20/2013 04:45 PM, Brendan Jurd wrote:
> Incompatibility:
> This patch introduces an incompatible change in the behaviour of the
> aforementioned array functions -- instead of returning NULL for empty
> arrays they return meaningful values.  Applications relying on the old
> behaviour to test for emptiness may be disrupted.  One can

As a heavy user of arrays, I support this patch as being worth the
breakage of backwards compatibility.  However, that means it certainly
will need to wait for 9.4 to provide adequate warning.

Brendan, how hard would it be to create a GUC for backwards-compatible
behavior?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



On 25 March 2013 13:02, Josh Berkus <josh@agliodbs.com> wrote:
> On 03/20/2013 04:45 PM, Brendan Jurd wrote:
>> Incompatibility:
>> This patch introduces an incompatible change in the behaviour of the
>> aforementioned array functions -- instead of returning NULL for empty
>> arrays they return meaningful values.  Applications relying on the old
>> behaviour to test for emptiness may be disrupted.  One can
>
> As a heavy user of arrays, I support this patch as being worth the
> breakage of backwards compatibility.  However, that means it certainly
> will need to wait for 9.4 to provide adequate warning.

Thanks Josh, I appreciate the support.

>
> Brendan, how hard would it be to create a GUC for backwards-compatible
> behavior?

Good idea.  I don't know how hard it would be (never messed with GUCs
before), but I'll take a swing at it and see how it works out.

Cheers,
BJ



Brendan Jurd <direvus@gmail.com> writes:
> On 25 March 2013 13:02, Josh Berkus <josh@agliodbs.com> wrote:
>> Brendan, how hard would it be to create a GUC for backwards-compatible
>> behavior?

> Good idea.

No, it *isn't* a good idea.  GUCs that change application-visible
semantics are dangerous.  We should have learned this lesson by now.
        regards, tom lane



On 26 March 2013 00:30, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Brendan Jurd <direvus@gmail.com> writes:
>> On 25 March 2013 13:02, Josh Berkus <josh@agliodbs.com> wrote:
>>> Brendan, how hard would it be to create a GUC for backwards-compatible
>>> behavior?
>
>> Good idea.
>
> No, it *isn't* a good idea.  GUCs that change application-visible
> semantics are dangerous.  We should have learned this lesson by now.
>

They are?  Well okay then.  I'm not deeply attached to the GUC thing,
it just seemed like a friendly way to ease the upgrade path.  But if
it's dangerous somehow I'm happy to drop it.

Cheers,
BJ



Brendan Jurd <direvus@gmail.com> writes:
> On 26 March 2013 00:30, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> No, it *isn't* a good idea.  GUCs that change application-visible
>> semantics are dangerous.  We should have learned this lesson by now.

> They are?

Yeah, they are, because things break when they're set wrong.

Sometimes we have to have one anyway, but AFAICS this patch is just
proposing to change corner cases that most apps don't hit.  I think
we'd be better off without a knob for it.
        regards, tom lane



On 21 March 2013 10:45, Brendan Jurd <direvus@gmail.com> wrote:
>  src/test/isolation/expected/timeouts.out        |  16 +-
>  src/test/isolation/specs/timeouts.spec          |   8 +-

Oops, looks like some unrelated changes made their way into the
original patch.  Apologies.  Here's a -v2 patch, sans stowaways.

Cheers,
BJ

Tom,

> No, it *isn't* a good idea.  GUCs that change application-visible
> semantics are dangerous.  We should have learned this lesson by now.

Really?  I thought that standard_conforming_strings was a great example
of how to ease our users into a backwards-compatibility break.   My
thought was that we change the behavior in 9.4, provide a
backwards-compatible GUC with warnings in the logs for two versions, and
then take the GUC away.

Whether that's worth the effort, though, is a question of how many
people are going to be affected by the compatibility break.  Maybe I can
do a survey.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



On Mon, Mar 25, 2013 at 5:14 PM, Josh Berkus <josh@agliodbs.com> wrote:
> Tom,
>
>> No, it *isn't* a good idea.  GUCs that change application-visible
>> semantics are dangerous.  We should have learned this lesson by now.
>
> Really?  I thought that standard_conforming_strings was a great example
> of how to ease our users into a backwards-compatibility break.

It is, but it was made in full knowledge of the risks and costs.

What do you do if you're given two modules that have opposite
expectations for this variable? One module or the other might have
hidden unexpected bugs or even security holes if the variable is set
in a way it doesn't expect. You have to pick one value for the whole
database.

Actually, I'm not as sanguine as Tom about how rarely these corner
cases will be hit.  As far as I can tell, checking IS NULL on
array_length() was the supported way to check for zero-length arrays
previously, so I suspect a lot of applications will need to be updated.
But it's not a hard update to do and is the kind of update that's
often needed on major database upgrades.

At least if it's a clean break then everyone on 9.3 knows they need to
do IS NULL and everyone on 9.4 knows they need to check for 0. If
there's a GUC then people need to code defensively without knowing
which semantics they'll get. And if they test they need to test twice
under both settings just in case the user's database has the other
setting.

-- 
greg



On 26 March 2013 05:26, Greg Stark <stark@mit.edu> wrote:
> Actually, I'm not as sanguine as Tom about how rarely these corner
> cases will be hit.  As far as I can tell, checking IS NULL on
> array_length() was the supported way to check for zero-length arrays
> previously

Correct.  There was no other way to check for empty arrays, because
every empty array was zero-D, and zero-D returns NULL from all array
interrogation functions, even array_ndims (which is totally bonkers).

>
> At least if it's a clean break then everyone on 9.3 knows they need to
> do IS NULL and everyone on 9.4 knows they need to check for 0.

And, as I pointed out at the top of the thread, you can write
"coalesce(array_length(a, 1), 0) = 0" if you want to be confident your
code will continue to work in either case, and for some folks I expect
that will be the least painful way to upgrade from 9.3 -> 9.4.

Cheers,
BJ



On 03/25/2013 10:28 PM, Tom Lane wrote:
> Yeah, they are, because things break when they're set wrong.
They also make debugging and support harder; you need to get an
ever-growing list of GUC values from the user to figure out what their
query does. bytea_output, standard_conforming_strings, etc. Yick.

That said, I don't have a better answer for introducing non-BC changes.

-- 
Craig Ringer                    http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




On 2013.03.25 5:55 PM, Craig Ringer wrote:
> On 03/25/2013 10:28 PM, Tom Lane wrote:
>> Yeah, they are, because things break when they're set wrong.
> They also make debugging and support harder; you need to get an
> ever-growing list of GUC values from the user to figure out what their
> query does. bytea_output, standard_conforming_strings, etc. Yick.
>
> That said, I don't have a better answer for introducing non-BC changes.

Given the general trouble GUC values cause, is there a plan to deprecate and 
remove each of the existing ones over time?  As long as post-removal there isn't 
any actual loss of functionality, but users might have to change their code to 
do it "the one true way", that would seem a good thing.

-- Darren Duncan




On 2013.03.25 6:03 PM, Darren Duncan wrote:
> On 2013.03.25 5:55 PM, Craig Ringer wrote:
>> On 03/25/2013 10:28 PM, Tom Lane wrote:
>>> Yeah, they are, because things break when they're set wrong.
>> They also make debugging and support harder; you need to get an
>> ever-growing list of GUC values from the user to figure out what their
>> query does. bytea_output, standard_conforming_strings, etc. Yick.
>>
>> That said, I don't have a better answer for introducing non-BC changes.
>
> Given the general trouble GUC values cause, is there a plan to deprecate and
> remove each of the existing ones over time?  As long as post-removal there isn't
> any actual loss of functionality, but users might have to change their code to
> do it "the one true way", that would seem a good thing. -- Darren Duncan

To clarify, I mean GUC related to backwards compatibility matters, such as 
bytea_output or standard_conforming_strings, things that affect the logical 
behavior of code.  I don't mean all GUC, not at all, most of the ones I know 
about should remain configurable.

-- Darren Duncan




On Mon, Mar 25, 2013 at 10:14:15AM -0700, Josh Berkus wrote:
> Tom,
> 
> > No, it *isn't* a good idea.  GUCs that change application-visible
> > semantics are dangerous.  We should have learned this lesson by now.
> 
> Really?  I thought that standard_conforming_strings was a great example
> of how to ease our users into a backwards-compatibility break.   My
> thought was that we change the behavior in 9.4, provide a
> backwards-compatible GUC with warnings in the logs for two versions, and
> then take the GUC away.

standard_conforming_strings is not a good example because it took 5+
years to implement the change, and issued warnings about non-standard
use for several releases --- it is not a pattern to follow.

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +



Bruce Momjian <bruce@momjian.us> writes:
> On Mon, Mar 25, 2013 at 10:14:15AM -0700, Josh Berkus wrote:
>>> No, it *isn't* a good idea.  GUCs that change application-visible
>>> semantics are dangerous.  We should have learned this lesson by now.

>> Really?  I thought that standard_conforming_strings was a great example
>> of how to ease our users into a backwards-compatibility break.   My
>> thought was that we change the behavior in 9.4, provide a
>> backwards-compatible GUC with warnings in the logs for two versions, and
>> then take the GUC away.

> standard_conforming_strings is not a good example because it took 5+
> years to implement the change, and issued warnings about non-standard
> use for several releases --- it is not a pattern to follow.

s_c_s was an example of the worst possible case: where the behavioral
change not merely breaks applications, but breaks them in a way that
creates easily-exploitable security holes.  We *had* to take that one
really slow, and issue warnings for several years beforehand (and IIRC,
there were still gripes from people who complained that we'd caused them
security problems).  I can't imagine that we'd go to that kind of
trouble for any less-sensitive behavioral change.
        regards, tom lane



On Sun, Mar 24, 2013 at 10:02 PM, Josh Berkus <josh@agliodbs.com> wrote:
> On 03/20/2013 04:45 PM, Brendan Jurd wrote:
>> Incompatibility:
>> This patch introduces an incompatible change in the behaviour of the
>> aforementioned array functions -- instead of returning NULL for empty
>> arrays they return meaningful values.  Applications relying on the old
>> behaviour to test for emptiness may be disrupted.  One can
>
> As a heavy user of arrays, I support this patch as being worth the
> breakage of backwards compatibility.  However, that means it certainly
> will need to wait for 9.4 to provide adequate warning.

I expect to lose this argument, but I think this is a terrible idea.
Users really hate it when they try to upgrade and find that they, uh,
can't, because of some application-level incompatibility like this.
They hate it twice as much when the change is essentially cosmetic.
There's no functional problems with arrays as they exist today that
this change would solve.

The way to make a change like this without breaking things for users
is to introduce a new type with different behavior and gradually
deprecate the old one.  Now, maybe it doesn't seem worth doing that
for a change this small.  But if so, I think that's evidence that this
isn't worth changing in the first place, not that it's worth changing
without regard for backwards-compatibility.

Personally, I think if we're going to start whacking around the
behavior here and risk inconveniencing our users, we ought to think a
little larger.  The fundamental thing that's dictating the current
behavior is that we have arrays of between 1 and 6 dimensions all
rolled up under a single data type.  But in many cases, if not nearly
all cases, what people want is, specifically, a one-dimensional array.
 If we were going to actually bite the bullet and create separate data
types for each possible number of array dimensions... and maybe fix
some other problems at the same time... then the effort involved in
ensuring backward-compatibility might seem like time better spent.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



2013/3/26 Robert Haas <robertmhaas@gmail.com>:
> On Sun, Mar 24, 2013 at 10:02 PM, Josh Berkus <josh@agliodbs.com> wrote:
>> On 03/20/2013 04:45 PM, Brendan Jurd wrote:
>>> Incompatibility:
>>> This patch introduces an incompatible change in the behaviour of the
>>> aforementioned array functions -- instead of returning NULL for empty
>>> arrays they return meaningful values.  Applications relying on the old
>>> behaviour to test for emptiness may be disrupted.  One can
>>
>> As a heavy user of arrays, I support this patch as being worth the
>> breakage of backwards compatibility.  However, that means it certainly
>> will need to wait for 9.4 to provide adequate warning.
>
> I expect to lose this argument, but I think this is a terrible idea.
> Users really hate it when they try to upgrade and find that they, uh,
> can't, because of some application-level incompatibility like this.
> They hate it twice as much when the change is essentially cosmetic.
> There's no functional problems with arrays as they exist today that
> this change would solve.
>
> The way to make a change like this without breaking things for users
> is to introduce a new type with different behavior and gradually
> deprecate the old one.  Now, maybe it doesn't seem worth doing that
> for a change this small.  But if so, I think that's evidence that this
> isn't worth changing in the first place, not that it's worth changing
> without regard for backwards-compatibility.
>
> Personally, I think if we're going to start whacking around the
> behavior here and risk inconveniencing our users, we ought to think a
> little larger.  The fundamental thing that's dictating the current
> behavior is that we have arrays of between 1 and 6 dimensions all
> rolled up under a single data type.  But in many cases, if not nearly
> all cases, what people want is, specifically, a one-dimensional array.
>  If we were going to actually bite the bullet and create separate data
> types for each possible number of array dimensions... and maybe fix
> some other problems at the same time... then the effort involved in
> ensuring backward-compatibility might seem like time better spent.
>

I understand, but I don't agree.  We have to fix the impractical
design of arrays early.  ARRAY is a first-class type, so it is not
possible to use the varchar2 trick here.

If we don't want to use a GUC, what do you think about a compatibility
extension?  We could overload the behaviour of the system functions.
That could solve the problem with updates and migrations.

Regards

Pavel


> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers



On 26 March 2013 22:57, Robert Haas <robertmhaas@gmail.com> wrote:
> They hate it twice as much when the change is essentially cosmetic.
> There's no functional problems with arrays as they exist today that
> this change would solve.
>

We can't sensibly test for whether an array is empty.  I'd call that a
functional problem.

The NULL return from array_{length,lower,upper,ndims} is those
functions' way of saying their arguments failed a sanity check.  So we
cannot distinguish in a disciplined way between a valid, empty array,
and bad arguments.  If the zero-D implementation had been more
polished and say, array_ndims returned zero, we had provided an
array_empty function, or the existing functions threw errors for silly
arguments instead of returning NULL, then I'd be more inclined to see
your point.  But as it stands, the zero-D implementation has always
been half-baked and slightly broken, we just got used to working
around it.
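
For anyone following along, the pre-patch behaviour described above
looks like this:

SELECT array_ndims('{}'::int[]);      -- NULL
SELECT array_length('{}'::int[], 1);  -- NULL
SELECT array_lower('{}'::int[], 1);   -- NULL
SELECT array_upper('{}'::int[], 1);   -- NULL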

Cheers,
BJ



> I expect to lose this argument, but I think this is a terrible idea.
> Users really hate it when they try to upgrade and find that they, uh,
> can't, because of some application-level incompatibility like this.
> They hate it twice as much when the change is essentially cosmetic.
> There's no functional problems with arrays as they exist today that
> this change would solve.

Sure there is.   How do you distinguish between an array which is NULL
and an array which is empty?

Also, the whole array_dims-is-NULL thing trips up pretty much every
single PG user who uses arrays for the first time.  I'd expect that
when we announce the fix, we'll find that many users were doing the
wrong thing in the first place and didn't know why it wasn't working.
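
To spell out the ambiguity: under the current behaviour the
interrogation functions give the same answer for a NULL array and an
empty one, so only a direct IS NULL test can tell them apart:

SELECT array_length(NULL::int[], 1);  -- NULL
SELECT array_length('{}'::int[], 1);  -- NULL
SELECT '{}'::int4[] IS NULL;          -- false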

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



On Tue, Mar 26, 2013 at 9:02 AM, Brendan Jurd <direvus@gmail.com> wrote:
> On 26 March 2013 22:57, Robert Haas <robertmhaas@gmail.com> wrote:
>> They hate it twice as much when the change is essentially cosmetic.
>> There's no functional problems with arrays as they exist today that
>> this change would solve.
>
> We can't sensibly test for whether an array is empty.  I'd call that a
> functional problem.

Sure you can.  Equality comparisons work just fine.

rhaas=# select '{}'::int4[] = '{}'::int4[];
 ?column? 
----------
 t
(1 row)

rhaas=# select '{}'::int4[] = '{1}'::int4[];
 ?column? 
----------
 f
(1 row)

> The NULL return from array_{length,lower,upper,ndims} is those
> functions' way of saying their arguments failed a sanity check.  So we
> cannot distinguish in a disciplined way between a valid, empty array,
> and bad arguments.  If the zero-D implementation had been more
> polished and say, array_ndims returned zero, we had provided an
> array_empty function, or the existing functions threw errors for silly
> arguments instead of returning NULL, then I'd be more inclined to see
> your point.  But as it stands, the zero-D implementation has always
> been half-baked and slightly broken, we just got used to working
> around it.

Well, you could easily change array_ndims() to error out if ARR_NDIM()
is negative or more than MAXDIM and return NULL only if it's exactly
0.  That wouldn't break backward compatibility because it would throw
an error only if fed a value that shouldn't ever exist in the first
place, short of a corrupted database.  I imagine the other functions
are amenable to similar treatment.

And if neither that nor just comparing against an empty array literal
floats your boat, adding an array_is_empty() function would let you
test for this condition without breaking backward compatibility, too.
That's overkill, I think, but it would work.
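
For what it's worth, such a function can be written in SQL today
without any core changes.  A rough sketch (array_is_empty is just the
name suggested above), leaning on the current zero-D behaviour of
array_ndims:

CREATE FUNCTION array_is_empty(anyarray) RETURNS boolean
    LANGUAGE sql IMMUTABLE STRICT
    AS $$ SELECT array_ndims($1) IS NULL $$;

SELECT array_is_empty('{}'::int[]);     -- true
SELECT array_is_empty('{1,2}'::int[]);  -- false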

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On 27 March 2013 06:47, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Mar 26, 2013 at 9:02 AM, Brendan Jurd <direvus@gmail.com> wrote:
>> We can't sensibly test for whether an array is empty.  I'd call that a
>> functional problem.
>
> Sure you can.  Equality comparisons work just fine.
>
> rhaas=# select '{}'::int4[] = '{}'::int4[];

The good news is, if anybody out there is using that idiom to test for
emptiness, they will not be disrupted by the change.

Cheers,
BJ



Robert Haas <robertmhaas@gmail.com> writes:
> Well, you could easily change array_ndims() to error out if ARR_NDIM()
> is negative or more than MAXDIM and return NULL only if it's exactly
> 0.  That wouldn't break backward compatibility because it would throw
> an error only if fed a value that shouldn't ever exist in the first
> place, short of a corrupted database.  I imagine the other functions
> are amenable to similar treatment.

I haven't looked at the patch in detail, but I thought one of the key
changes was that '{}' would now be interpreted as a zero-length 1-D
array rather than a zero-D array.  If we do that it seems a bit moot
to argue about whether we should exactly preserve backwards-compatible
behavior in array_ndims(), because the input it's looking at won't be
the same anymore anyway.

In any case, the entire point of this proposal is that the current
behavior around zero-D arrays is *broken* and we don't want to be
backwards-compatible with it anymore.  So if you wish to argue against
that opinion, do so; but it seems a bit beside the point to simply
complain that backwards compatibility is being lost.
        regards, tom lane



On 26 March 2013 20:44, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Well, you could easily change array_ndims() to error out if ARR_NDIM()
>> is negative or more than MAXDIM and return NULL only if it's exactly
>> 0.  That wouldn't break backward compatibility because it would throw
>> an error only if fed a value that shouldn't ever exist in the first
>> place, short of a corrupted database.  I imagine the other functions
>> are amenable to similar treatment.
>
> I haven't looked at the patch in detail, but I thought one of the key
> changes was that '{}' would now be interpreted as a zero-length 1-D
> array rather than a zero-D array.  If we do that it seems a bit moot
> to argue about whether we should exactly preserve backwards-compatible
> behavior in array_ndims(), because the input it's looking at won't be
> the same anymore anyway.
>

The patch is also allowing '{{},{},{}}' which is described up-thread
as a 2-D empty array. That's pretty misleading, since it has length 3
(in the first dimension). '{{},{}}' and '{{}}' are both "more empty",
but neither is completely empty. It feels as though, if you were going
down that road, then for completeness you'd need a new syntax for a
truly empty 2-D array, e.g., '{}^2', but I can't really think of a
use-case for that, or for any of the others for that matter.

You'd be making it possible to have multiple different 2-D arrays, all
empty in the sense of having no elements, but which would compare
differently. Pretty much all you would be able to do with such arrays
would be to append additional empty arrays.


> In any case, the entire point of this proposal is that the current
> behavior around zero-D arrays is *broken* and we don't want to be
> backwards-compatible with it anymore.  So if you wish to argue against
> that opinion, do so; but it seems a bit beside the point to simply
> complain that backwards compatibility is being lost.
>

I'm not saying that the current situation is not broken. I'm just
questioning whether the fix is actually any less confusing than what
we have now.

+1 for adding an array_is_empty() function though --- I think there is
enough evidence just in this thread that the
current API is confusing. We don't currently provide a definitive way
to test for empty arrays, so people inevitably invent their own (all
subtly different) solutions.

Regards,
Dean



On 28 March 2013 00:21, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
> The patch is also allowing '{{},{},{}}' which is described up-thread
> as a 2-D empty array. That's pretty misleading, since it has length 3
> (in the first dimension). '{{},{}}' and '{{}}' are both "more empty",
> but neither is completely empty.

> I'm not saying that the current situation is not broken. I'm just
> questioning whether the fix is actually any less confusing than what
> we have now.

Well the fix is primarily about 1-D empty arrays, and in that respect
it is much less confusing than what we have now.  As for multidim
arrays, I don't use them in the field, so I don't have a strong
opinion about how (or even whether) we should support empty multidim.
I included the '{{}}' syntax following comments from Tom that we
should consider supporting that if we were to get rid of zero-D, but I
don't really have any skin in that game.

Since we require multidim arrays to be regular, perhaps they would
need extents in all dimensions to be practically useful ... if you
wanted to define a blank tic-tac-toe board using a 2-D array, for
example, you'd probably define it as 3x3 with NULL values in each
position, rather than trying to make it "empty".
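
For example, array_fill already builds such a board directly:

SELECT array_fill(NULL::int, ARRAY[3,3]);
-- {{NULL,NULL,NULL},{NULL,NULL,NULL},{NULL,NULL,NULL}}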

Cheers,
BJ



On 27 March 2013 17:14, Brendan Jurd <direvus@gmail.com> wrote:
> On 28 March 2013 00:21, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>> The patch is also allowing '{{},{},{}}' which is described up-thread
>> as a 2-D empty array. That's pretty misleading, since it has length 3
>> (in the first dimension). '{{},{}}' and '{{}}' are both "more empty",
>> but neither is completely empty.
>
>> I'm not saying that the current situation is not broken. I'm just
>> questioning whether the fix is actually any less confusing than what
>> we have now.
>
> Well the fix is primarily about 1-D empty arrays, and in that respect
> it is much less confusing than what we have now.

Maybe. But even in 1-D, it's still jumping from having one empty array
to infinitely many starting at different indexes, e.g., '{}'::int[] !=
'[4:3]={}'::int[]. There may be a certain logic to that, but I'm not
convinced about its usefulness.

Also, it is incompatible with the choice made for empty ranges, which
are all normalised to a single unique empty range value --- the empty
set, with no start- or end-points. I find that quite logical, and so
to me, it makes most sense to also have a single unique empty array,
which is then necessarily dimensionless.

Regards,
Dean



On 28 March 2013 09:39, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
> On 27 March 2013 17:14, Brendan Jurd <direvus@gmail.com> wrote:
>> Well the fix is primarily about 1-D empty arrays, and in that respect
>> it is much less confusing than what we have now.
>
> Maybe. But even in 1-D, it's still jumping from having one empty array
> to infinitely many starting at different indexes, e.g., '{}'::int[] !=
> '[4:3]={}'::int[]. There may be a certain logic to that, but I'm not
> convinced about its usefulness.

We already have the ability to define lower bounds other than 1 on
arrays, and it would be inconsistent to allow that for arrays with
elements, but not for arrays without.  I could imagine somebody
wanting to create an empty zero-based array, and then iteratively
append elements to it.

> Also, it is incompatible with the choice made for empty ranges,

To me it doesn't make sense to try to draw parallels between arrays
and ranges, they really are completely different things.

Cheers,
BJ



Brendan Jurd <direvus@gmail.com> writes:
> On 28 March 2013 09:39, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>> Maybe. But even in 1-D, it's still jumping from having one empty array
>> to infinitely many starting at different indexes, e.g., '{}'::int[] !=
>> '[4:3]={}'::int[]. There may be a certain logic to that, but I'm not
>> convinced about its usefulness.

> We already have the ability to define lower bounds other than 1 on
> arrays, and it would be inconsistent to allow that for arrays with
> elements, but not for arrays without.

Yeah, if '[1:1]={0}'::int[] is distinct from '[2:2]={0}'::int[],
it's a bit hard to argue that '[1:0]={}'::int[] must not be
distinct from '[2:1]={}'::int[].  If we were doing this from scratch
we might drop the whole notion of nondefault lower bounds, but that
ship sailed ages ago.
        regards, tom lane



> We already have the ability to define lower bounds other than 1 on
> arrays, and it would be inconsistent to allow that for arrays with
> elements, but not for arrays without.  I could imagine somebody
> wanting to create an empty zero-based array, and then iteratively
> append elements to it.

Hmmm.  You know, I think that's why we originally had array_upper() on
an empty array return NULL, not 0.
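That is, under the current behaviour:

```sql
-- All three bound/length functions answer NULL for the empty array.
SELECT array_upper('{}'::int[], 1),   -- NULL
       array_lower('{}'::int[], 1),   -- NULL
       array_length('{}'::int[], 1);  -- NULL
```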

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



On 28 March 2013 00:04, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Brendan Jurd <direvus@gmail.com> writes:
>> On 28 March 2013 09:39, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>>> Maybe. But even in 1-D, it's still jumping from having one empty array
>>> to infinitely many starting at different indexes, e.g., '{}'::int[] !=
>>> '[4:3]={}'::int[]. There may be a certain logic to that, but I'm not
>>> convinced about its usefulness.
>
>> We already have the ability to define lower bounds other than 1 on
>> arrays, and it would be inconsistent to allow that for arrays with
>> elements, but not for arrays without.
>
> Yeah, if '[1:1]={0}'::int[] is distinct from '[2:2]={0}'::int[],
> it's a bit hard to argue that '[1:0]={}'::int[] must not be
> distinct from '[2:1]={}'::int[].  If we were doing this from scratch
> we might drop the whole notion of nondefault lower bounds, but that
> ship sailed ages ago.
>

You could make the exact same argument for ranges --- if
'[1,1]'::int4range is distinct from '[2,2]'::int4range, why isn't
'[1,1)'::int4range distinct from '[2,2)'::int4range?

I see ranges and arrays as very closely related because the extents of
an array are an integer range, and the extents of an empty array are
an empty range. Moreover, they have almost identical API functions.

There are two internally self-consistent models for handling empty
ranges/arrays --- one in which empty ranges/arrays are considered not
to have lower/upper bounds, and one in which they are. In the first
model, there is only one empty range/array. In the second, there are
infinitely many, all different. Both models can be written in a
consistent way, but what seems inconsistent is to choose one model for
ranges, and change to a different model for arrays.

Regards,
Dean



Dean Rasheed <dean.a.rasheed@gmail.com> writes:
> On 28 March 2013 00:04, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Yeah, if '[1:1]={0}'::int[] is distinct from '[2:2]={0}'::int[],
>> it's a bit hard to argue that '[1:0]={}'::int[] must not be
>> distinct from '[2:1]={}'::int[].

> You could make the exact same argument for ranges --- if
> '[1,1]'::int4range is distinct from '[2,2]'::int4range, why isn't
> '[1,1)'::int4range distinct from '[2,2)'::int4range?

Because an array's bounds are a core piece of the identity of the
array.  In another universe PG's arrays might not have been defined
like that, but as things stand, they are.  We chose to define ranges
differently.  That's okay, because ranges *are not arrays*.  If they
were the same, we'd not have needed to invent a separate concept.

> I see ranges and arrays as very closely related because the extents of
> an array are an integer range, and the extents of an empty array are
> an empty range. Moreover, they have almost identical API functions.

I think this argument has about as much validity as claiming that
because the subscripts of an array are integers, arrays should behave
like integers.
        regards, tom lane



On 28 March 2013 03:01, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> [snip]
> ranges *are not arrays*.

OK, fair enough. I guess it's the mathematician in me seeing patterns
in things that behave similarly, but which are admittedly different.

Is the patch also going to allow empty arrays in higher dimensions
where not just the last dimension is empty? It seems as though, if
it's allowing 1-by-0 arrays like '{{}}' and '[4:4][8:7]={{}}', it
should also allow 0-by-0 arrays like '[4:3][8:7]={}', and 0-by-3
arrays like '[4:3][11:13]={}'.

That last example seems like the more useful kind of thing to allow,
since you might one day be able to append a non-empty 1-D array onto
it. As it stands, the patch only allows empty 2-D arrays that are
empty in the final dimension, to which the only thing you could append
would be more empty 1-D arrays.
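For comparison, appending a 1-D array to a non-empty 2-D array already
works today:

```sql
-- || (array_cat) pushes an N-1 dimensional array onto an N dimensional
-- one as a new "row":
SELECT '{{1,2},{3,4}}'::int[] || '{5,6}'::int[];
-- {{1,2},{3,4},{5,6}}
```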

Regards,
Dean



On 28 March 2013 20:34, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
> Is the patch also going to allow empty arrays in higher dimensions
> where not just the last dimension is empty?

It doesn't allow that at present.

> It seems as though, if
> it's allowing 1-by-0 arrays like '{{}}' and '[4:4][8:7]={{}}', it
> should also allow 0-by-0 arrays like '[4:3][8:7]={}', and 0-by-3
> arrays like '[4:3][11:13]={}'.

I think the array literal parser would reject this on the grounds that
the brace structure doesn't match the dimension specifiers.  You could
modify that check to respect zero length in dimensions other than the
innermost one, but it's hard to say whether it would be worth the
effort.

It might be instructive to hear from somebody who does (or intends to)
use multidim arrays for some practical purpose, but I don't even know
whether such a person exists within earshot of this list.

Cheers,
BJ



Re: [PATCH] Exorcise "zero-dimensional" arrays

From
Dimitri Fontaine
Date:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> I understand, but I don't agree. W have to fix impractical design of
> arrays early. A ARRAY is 1st class - so there is not possible to use
> varchar2 trick.
>
> if we don't would to use GUC, what do you think about compatible
> extension? We can overload a system functions behave. This can solve a
> problem with updates and migrations.

In Common Lisp arrays are multi-dimensional too, and the 1-D arrays have
a specific name, they call them vectors. If you create an array with
only 1 dimension, then it's a vector. All the array API works the same
on vectors.

I guess we could specialize some of our API on vectors and avoid having
to tell the system which dimension we are interested into when using
them rather than multi-dimensional arrays.

Now from the internals perspective I also guess we don't want to be
generating so many more types so maybe we would need some tricks to know
how to promote a 1-D array into a vector automatically?

-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support



Re: [PATCH] Exorcise "zero-dimensional" arrays

From
Pavel Stehule
Date:
I am not sure CL is the best example - it has no SQL NULL logic.

The current issue is introduced by design: an empty array is similar to NULL (but not the same), which is halfway to Oracle's varchar behaviour. But it is not consistent with the other data types in Postgres.

Regards

Pavel




On Tue, Mar 26, 2013 at 4:39 PM, Brendan Jurd <direvus@gmail.com> wrote:
> On 27 March 2013 06:47, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Tue, Mar 26, 2013 at 9:02 AM, Brendan Jurd <direvus@gmail.com> wrote:
>>> We can't sensibly test for whether an array is empty.  I'd call that a
>>> functional problem.
>>
>> Sure you can.  Equality comparisons work just fine.
>>
>> rhaas=# select '{}'::int4[] = '{}'::int4[];
>
> The good news is, if anybody out there is using that idiom to test for
> emptiness, they will not be disrupted by the change.

According to the discussion downthread, apparently they will, because
you're introducing an infinitude of empty arrays, not all of which
compare equal to '{}'::int4[].

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On Tue, Mar 26, 2013 at 4:44 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Well, you could easily change array_ndims() to error out if ARR_NDIM()
>> is negative or more than MAXDIM and return NULL only if it's exactly
>> 0.  That wouldn't break backward compatibility because it would throw
>> an error only if fed a value that shouldn't ever exist in the first
>> place, short of a corrupted database.  I imagine the other functions
>> are amenable to similar treatment.
>
> I haven't looked at the patch in detail, but I thought one of the key
> changes was that '{}' would now be interpreted as a zero-length 1-D
> array rather than a zero-D array.  If we do that it seems a bit moot
> to argue about whether we should exactly preserve backwards-compatible
> behavior in array_ndims(), because the input it's looking at won't be
> the same anymore anyway.
>
> In any case, the entire point of this proposal is that the current
> behavior around zero-D arrays is *broken* and we don't want to be
> backwards-compatible with it anymore.  So if you wish to argue against
> that opinion, do so; but it seems a bit beside the point to simply
> complain that backwards compatibility is being lost.

I don't think the current behavior is broken.  I found it
counterintuitive at first, but then I got used to it.  It's reasonably
self-consistent: arrays can't have empty dimensions, therefore the
empty array is unique and dimensionless.  Is that the behavior I would
have picked if I had designed the type?  No, it isn't.  I wouldn't
have tried to support one-dimensional arrays and multi-dimensional
arrays in the same data type either, nor would I have supported
non-default lower bounds.  But all of those ships have sailed, and the
time to change them is not after people have spent 10 years building
applications that work with the current behavior.  If we want to
introduce a new type with different, perhaps better, behavior, well, I
think that might be a fine idea.  But I *don't* think imposing a hard
compatibility break on users of arrays is a good idea.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On Mon, Apr 1, 2013 at 6:43 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Mar 26, 2013 at 4:44 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> Well, you could easily change array_ndims() to error out if ARR_NDIM()
>>> is negative or more than MAXDIM and return NULL only if it's exactly
>>> 0.  That wouldn't break backward compatibility because it would throw
>>> an error only if fed a value that shouldn't ever exist in the first
>>> place, short of a corrupted database.  I imagine the other functions
>>> are amenable to similar treatment.
>>
>> I haven't looked at the patch in detail, but I thought one of the key
>> changes was that '{}' would now be interpreted as a zero-length 1-D
>> array rather than a zero-D array.  If we do that it seems a bit moot
>> to argue about whether we should exactly preserve backwards-compatible
>> behavior in array_ndims(), because the input it's looking at won't be
>> the same anymore anyway.
>>
>> In any case, the entire point of this proposal is that the current
>> behavior around zero-D arrays is *broken* and we don't want to be
>> backwards-compatible with it anymore.  So if you wish to argue against
>> that opinion, do so; but it seems a bit beside the point to simply
>> complain that backwards compatibility is being lost.
>
> I don't think the current behavior is broken.  I found it
> counterintuitive at first, but then I got used to it.  It's reasonably
> self-consistent: arrays can't have empty dimensions, therefore the
> empty array is unique and dimensionless.  Is that the behavior I would
> have picked if I had designed the type?  No, it isn't.  I wouldn't
> have tried to support one-dimensional arrays and multi-dimensional
> arrays in the same data type either, nor would I have supported
> non-default lower bounds.  But all of those ships have sailed, and the
> time to change them is not after people have spent 10 years building
> applications that work with the current behavior.  If we want to
> introduce a new type with different, perhaps better, behavior, well, I
> think that might be a fine idea.  But I *don't* think imposing a hard
> compatibility break on users of arrays is a good idea.

100% agree.  Also huge +1 on your backwards compatibility comments
upthread -- couldn't agree more.  My $company just wrapped up a one
year porting effort to 9.2 from *8.1* due to compatibility issues.

If you want custom array behaviors, creating a type is probably the
best way unless it can be 100% proven that this will not break code.

merlin



On 4/1/13 8:58 AM, Merlin Moncure wrote:
> On Mon, Apr 1, 2013 at 6:43 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Tue, Mar 26, 2013 at 4:44 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Robert Haas <robertmhaas@gmail.com> writes:
>>>> Well, you could easily change array_ndims() to error out if ARR_NDIM()
>>>> is negative or more than MAXDIM and return NULL only if it's exactly
>>>> 0.  That wouldn't break backward compatibility because it would throw
>>>> an error only if fed a value that shouldn't ever exist in the first
>>>> place, short of a corrupted database.  I imagine the other functions
>>>> are amenable to similar treatment.
>>>
>>> I haven't looked at the patch in detail, but I thought one of the key
>>> changes was that '{}' would now be interpreted as a zero-length 1-D
>>> array rather than a zero-D array.  If we do that it seems a bit moot
>>> to argue about whether we should exactly preserve backwards-compatible
>>> behavior in array_ndims(), because the input it's looking at won't be
>>> the same anymore anyway.
>>>
>>> In any case, the entire point of this proposal is that the current
>>> behavior around zero-D arrays is *broken* and we don't want to be
>>> backwards-compatible with it anymore.  So if you wish to argue against
>>> that opinion, do so; but it seems a bit beside the point to simply
>>> complain that backwards compatibility is being lost.
>>
>> I don't think the current behavior is broken.  I found it
>> counterintuitive at first, but then I got used to it.  It's reasonably
>> self-consistent: arrays can't have empty dimensions, therefore the
>> empty array is unique and dimensionless.  Is that the behavior I would
>> have picked if I had designed the type?  No, it isn't.  I wouldn't
>> have tried to support one-dimensional arrays and multi-dimensional
>> arrays in the same data type either, nor would I have supported
>> non-default lower bounds.  But all of those ships have sailed, and the
>> time to change them is not after people have spent 10 years building
>> applications that work with the current behavior.  If we want to
>> introduce a new type with different, perhaps better, behavior, well, I
>> think that might be a fine idea.  But I *don't* think imposing a hard
>> compatibility break on users of arrays is a good idea.
>
> 100% agree.  Also huge +1 on your backwards compatibility comments
> upthread -- couldn't agree more.  My $company just wrapped up a one
> year porting effort to 9.2 from *8.1* due to compatibility issues.
>
> If you want custom array behaviors, creating a type is probably the
> best way unless it can be 100% proven that this will not break code.

+1, and FWIW I'd love to see a new, cleaner array implementation. My beefs with the current system:

- Adjustable lower bound is *way* more trouble than it's worth (does
anyone actually have a use-case for supporting it?)
- Need to be able to tell if an array is actually empty/null, which is
not the same as an array where all elements are null
- Enforced dimensions would be nice, though I can't say it's that big a deal
- Not sure that special-casing 1-dimension arrays would be worth it...
perhaps enforced dimensions solve whatever problems exist there



On 1 April 2013 21:57, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Mar 26, 2013 at 4:39 PM, Brendan Jurd <direvus@gmail.com> wrote:
>> On 27 March 2013 06:47, Robert Haas <robertmhaas@gmail.com> wrote:
>>> rhaas=# select '{}'::int4[] = '{}'::int4[];
>>
>> The good news is, if anybody out there is using that idiom to test for
>> emptiness, they will not be disrupted by the change.
>
> According to the discussion downthread, apparently they will, because
> you're introducing an infinitude of empty arrays, not all of which
> compare equal to '{}'::int4[].

It is not possible to construct e.g. '[3:2]={}' or '{{}, {}}' in
existing applications, so there is no way for that idiom in existing
applications to be broken by upgrading.  If testing for equality with
'{}' works now, it will also work post-upgrade.

The only way for it to stop working is if somebody upgrades, and
*then* goes out of their way to create an empty array with nondefault
lower bounds, and then tries to compare that array against the empty
array with default lower bounds, to test for emptiness.  Which would
be silly.

Big picture: A very large number of users wouldn't be using arrays at
all, and of those who are, probably a vanishingly small number
(perhaps zero) care about how emptiness interacts with multiple
dimensions or nondefault lower bounds.  We're talking about a corner
case inside a corner case here.

For most folks, this upgrade would break nothing.  A few (myself
included) will want to grep their code for
array_(lower|upper|length|dims) call sites and maybe make some tweaks.

Cheers,
BJ



On Mon, Apr 1, 2013 at 6:40 PM, Brendan Jurd <direvus@gmail.com> wrote:
> It is not possible to construct e.g. '[3:2]={}' or '{{}, {}}' in
> existing applications, so there is no way for that idiom in existing
> applications to be broken by upgrading.  If testing for equality with
> '{}' works now, it will also work post-upgrade.

I suppose, strictly speaking, that that is true, but I think there are
all kinds of ways for this to go wrong.  Your previous complaint was
that there was no easy way to check whether an array is empty.  I
pointed out that there is: compare it against an empty array and see
whether they are equal.  Now you're saying, let's change it so we can
have lots of different empty arrays, each one different from all the
others.  Well, that breaks the world for anybody who is using the
method that works today.  Sure, they've probably got to have at least
some kind of application change before the wheels really start to come
off, but as soon as some array that's empty but not equal to {} creeps
into their application by whatever means, they've got trouble.  And
what's the upside?  Do we really want or need an infinitude of
all-distinct empty arrays?

Besides, even if you're not convinced that that particular thing will
break applications, there are clearly going to be cases where some
array functions return different results after this change than they
did beforehand.  That's actually the whole point of this change.
Right?  And, if you change the value that a built-in function returns
when given the same input, that can break user applications.

> The only way for it to stop working is if somebody upgrades, and
> *then* goes out of their way to create an empty array with nondefault
> lower bounds, and then tries to compare that array against the empty
> array with default lower bounds, to test for emptiness.  Which would
> be silly.
>
> Big picture: A very large number of users wouldn't be using arrays at
> all, and of those who are, probably a vanishingly small number
> (perhaps zero) care about how emptiness interacts with multiple
> dimensions or nondefault lower bounds.  We're talking about a corner
> case inside a corner case here.

I think the first half of this argument is off-point, because it's
true in general, for any feature, that people who aren't using a
feature won't notice if the behavior of that feature changes.  But we
have good evidence that even obscure, largely broken features like
RULEs have user bases that are significant enough that changing the
behavior of those systems will generate protest.  I'd be surprised if
the number of people using arrays isn't far larger than the number of
people using rules for table partitioning.

As to the second half, part of the point of this patch is to change
things so that array_length('{}'::int4[], 1) returns non-NULL.  But I
myself have written code that tests for an empty array by writing
array_length(blahblah, 1) IS NULL.  So with your proposed change, that
code would break.  Sure, I would have been happier, when I wrote that
code, if the return value had been 0 rather than NULL.  But the return
value *was* NULL, so I wrote the code that way, and it worked.  I
can't be the only one to have done something like that.  Anyone who
has would get an ugly surprise if we did what you're proposing.  In a
large application, it might not even be clear (until you started
poking at it) what exactly had changed; and if they've got tens or
hundreds of thousands of lines of code (hardly unreasonable), fixing
it might be quite a chore.  Imposing that pain on our user base does not
seem justified to me.
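Concretely, the idiom at issue:

```sql
-- Today this is a working emptiness test; under the patch the
-- array_length() call would return 0 instead of NULL, so this
-- expression would silently become false for empty arrays.
SELECT array_length('{}'::int4[], 1) IS NULL;  -- currently t
```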

> For most folks, this upgrade would break nothing.  A few (myself
> included) will want to grep their code for
> array_(lower|upper|length|dims) call sites and maybe make some tweaks.

I think the only people for whom nothing will break are the people who
aren't using arrays in the first place.  Anyone who is is likely to
have dependencies on the way array_lower/upper work today.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On Apr 1, 2013, at 4:59 PM, Robert Haas <robertmhaas@gmail.com> wrote:

> I think the only people for whom nothing will break are the people who
> aren't using arrays in the first place.  Anyone who is is likely to
> have dependencies on the way array_lower/upper work today.

Well, what if we add new functions that return 0 for empty arrays, but
leave the existing ones alone? Perhaps call them array_size(),
array_first_index(), and array_last_index(). Then nothing has to break,
and we can decide independently if we want to deprecate the older
functions in a future release. Or not.

Best,

David




On 2 April 2013 10:59, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Apr 1, 2013 at 6:40 PM, Brendan Jurd <direvus@gmail.com> wrote:
>> It is not possible to construct e.g. '[3:2]={}' or '{{}, {}}' in
>> existing applications, so there is no way for that idiom in existing
>> applications to be broken by upgrading.  If testing for equality with
>> '{}' works now, it will also work post-upgrade.
>
> Sure, they've probably got to have at least
> some kind of application change before the wheels really start to come
> off, but as soon as some array that's empty but not equal to {} creeps
> into their application by whatever means, they've got trouble.

Constructing an array with nondefault bounds isn't happening by
accident, you'd have to know about, and deliberately make use of, the
obscure '[m:n]={}' syntax for array literals.  How is it going to
"creep in"?

I note the total absence of people rending their garments over the
fact that '{foo}' does not equal '[2:2]={foo}'.
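That is, lower bounds already participate in array equality today:

```sql
-- Same single element, different lower bound => not equal.
SELECT '{foo}'::text[] = '[2:2]={foo}'::text[];  -- f
```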

Cheers,
BJ



On 2 April 2013 11:34, David E. Wheeler <david@kineticode.com> wrote:
> On Apr 1, 2013, at 4:59 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
>> I think the only people for whom nothing will break are the people who
>> aren't using arrays in the first place.  Anyone who is is likely to
>> have dependencies on the way array_lower/upper work today.
>
> Well, what if we add new functions that return 0 for empty arrays, but
> leave the existing ones alone? Perhaps call them array_size(),
> array_first_index(), and array_last_index(). Then nothing has to break,
> and we can decide independently if we want to deprecate the older
> functions in a future release. Or not.

I think having an 'array_size' and an 'array_length' that behave
differently would be legitimately confusing, and I can't think of any
alternative function name that would concisely explain the difference
in behaviour -- 'array_length_without_the_stupid_nulls' is just too
long.

Cheers,
BJ



Hello


2013/4/2 Brendan Jurd <direvus@gmail.com>
On 2 April 2013 11:34, David E. Wheeler <david@kineticode.com> wrote:
> On Apr 1, 2013, at 4:59 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
>> I think the only people for whom nothing will break are the people who
>> aren't using arrays in the first place.  Anyone who is is likely to
>> have dependencies on the way array_lower/upper work today.
>
> Well, what if we add new functions that return 0 for empty arrays, but leave the existing ones alone? Perhaps call them array_size(), array_first_index(), and array_last_index(). Then nothing has to break, and we can decide independently if we want to deprecate the older functions in a future release. Or not.

I think having an 'array_size' and an 'array_length' that behave
differently would be legitimately confusing, and I can't think of any
alternative function name that would concisely explain the difference
in behaviour -- 'array_length_without_the_stupid_nulls' is just too
long.

yes,

we should do a complete redesign or do nothing.


Regards

Pavel

 

Cheers,
BJ


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

On 29/03/13 13:12, Brendan Jurd wrote:
> On 28 March 2013 20:34, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>> Is the patch also going to allow empty arrays in higher dimensions
>> where not just the last dimension is empty?
>
> It doesn't allow that at present.
>
>> It seems as though, if
>> it's allowing 1-by-0 arrays like '{{}}' and '[4:4][8:7]={{}}', it
>> should also allow 0-by-0 arrays like '[4:3][8:7]={}', and 0-by-3
>> arrays like '[4:3][11:13]={}'.
>
> I think the array literal parser would reject this on the grounds that
> the brace structure doesn't match the dimension specifiers.  You could
> modify that check to respect zero length in dimensions other than the
> innermost one, but it's hard to say whether it would be worth the
> effort.
>
> It might be instructive to hear from somebody who does (or intends to)
> use multidim arrays for some practical purpose, but I don't even know
> whether such a person exists within earshot of this list.

Multiple dimension arrays 'might' be something I may need, or at least
usefully use, in my current project.

I am not ready to do the database design to that detail yet.

However, I know I will need to do 'clever'(TM) things with multiple
image areas on images, and possibly group several images together (and
still access their individual image areas).  So potentially arrays of
dimension 2 or greater may be useful.  Certainly, some images will not
have any image areas defined, so some arrays might be empty.

Cheers,
Gavin


On Apr 1, 2013, at 13:43, Robert Haas <robertmhaas@gmail.com> wrote:
> I don't think the current behavior is broken.  I found it
> counterintuitive at first, but then I got used to it.  It's reasonably
> self-consistent: arrays can't have empty dimensions, therefore the
> empty array is unique and dimensionless.  Is that the behavior I would
> have picked if I had designed the type?  No, it isn't.  I wouldn't
> have tried to support one-dimensional arrays and multi-dimensional
> arrays in the same data type either, nor would I have supported
> non-default lower bounds.  But all of those ships have sailed, and the
> time to change them is not after people have spent 10 years building
> applications that work with the current behavior.  If we want to
> introduce a new type with different, perhaps better, behavior, well, I
> think that might be a fine idea.  But I *don't* think imposing a hard
> compatibility break on users of arrays is a good idea.

+1.

Especially since having infinitely many variants of "empty" is IMHO
hardly an improvement over the existing situation.

If we're going to break compatibility, we should IMHO get rid of
non-zero lower bounds all together. My guess is that the number of
affected users wouldn't be much higher than for the proposed patch,
and it'd allow lossless mapping to most language's native array types…

best regards,
Florian Pflug




On 04/02/2013 02:46 PM, Florian Pflug wrote:
> On Apr 1, 2013, at 13:43, Robert Haas <robertmhaas@gmail.com> wrote:
>> I don't think the current behavior is broken.  I found it
>> counterintuitive at first, but then I got used to it.  It's reasonably
>> self-consistent: arrays can't have empty dimensions, therefore the
>> empty array is unique and dimensionless.  Is that the behavior I would
>> have picked if I had designed the type?  No, it isn't.  I wouldn't
>> have tried to support one-dimensional arrays and multi-dimensional
>> arrays in the same data type either, nor would I have supported
>> non-default lower bounds.  But all of those ships have sailed, and the
>> time to change them is not after people have spent 10 years building
>> applications that work with the current behavior.  If we want to
>> introduce a new type with different, perhaps better, behavior, well, I
>> think that might be a fine idea.  But I *don't* think imposing a hard
>> compatibility break on users of arrays is a good idea.
> +1.
>
> Especially since having infinitely many variants of "empty" is IMHO
> hardly an improvement over the existing situation.
>
> If we're going to break compatibility, we should IMHO get rid of
> non-zero lower bounds all together. My guess is that the number of
> affected users wouldn't be much higher than for the proposed patch,
> and it'd allow lossless mapping to most language's native array types…


That would actually break a HUGE number of users, since the default
lower bound is 1. I have seen any number of pieces of code that rely on
that.

Doesn't the standard have something to say about all this?

cheers

andrew




On Apr3, 2013, at 15:30 , Andrew Dunstan <andrew@dunslane.net> wrote:
> On 04/02/2013 02:46 PM, Florian Pflug wrote:
>> If we're going to break compatibility, we should IMHO get rid of
>> non-zero lower bounds all together. My guess is that the number of
>> affected users wouldn't be much higher than for the proposed patch,
>> and it'd allow lossless mapping to most language's native array types…
>
> That would actually break a HUGE number of users, since the default lower
> bound is 1. I have seen any number of pieces if code that rely on that.

Uh, yeah, we should make it 1, not 0, then. As long as the bound
is fixed, conversion to native C/Java/Ruby/Python/... arrays would still
be lossless.

best regards,
Florian Pflug




Andrew Dunstan <andrew@dunslane.net> writes:
> On 04/02/2013 02:46 PM, Florian Pflug wrote:
>> If we're going to break compatibility, we should IMHO get rid of
>> non-zero lower bounds all together. My guess is that the number of
>> affected users wouldn't be much higher than for the proposed patch,
and it'd allow lossless mapping to most language's native array types…

> That would actually break a HUGE number of users, since the default 
> lower bound is 1. I have seen any number of pieces if code that rely on 
> that.

I assume he meant non-one lower bounds.  But in any case, removing the
option for other lower bounds would be very clearly a removal of useful
functionality, so I don't see it happening ... certainly not if we're
so tied to bug-compatibility that we can't redefine the number of
dimensions an empty array has.

I think though that the upthread argument that we'd have multiple
interpretations of the same thing is bogus.  To me, the core idea that's
being suggested here is that '{}' should mean a zero-length 1-D array,
not a zero-D array as formerly.  We would still need a way to represent
zero-D arrays, if only because they'd still exist on-disk in existing
databases (assuming we're not willing to break pg_upgrade for this).
I suggest that that ought *not* involve any braces.  Perhaps '[]=' would
be a suitable representation.  In the other direction, ISTM that
'{{},{},{}}' is a zero-by-three array, entirely distinct from '{}' or
'{{}}' in dimensionality if not content.  I haven't worked this out in
complete detail, but if different strings mean the same thing then we
don't have the representation defined right.
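
Spelled out, a hypothetical summary of the distinct representations (the
'[]=' spelling is only a suggestion, not implemented syntax, and the 2-D
readings follow the proposed patch's dimension scheme):

```
'{}'           -- zero-length 1-D array
'{{}}'         -- 2-D array containing one zero-length row
'{{},{},{}}'   -- 2-D array containing three zero-length rows
'[]='          -- legacy zero-D array, deliberately brace-free
```

Each string differs in dimensionality, so no two strings would denote
the same value.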
        regards, tom lane



On 04/04/13 03:02, Florian Pflug wrote:
> On Apr3, 2013, at 15:30 , Andrew Dunstan <andrew@dunslane.net> wrote:
>> On 04/02/2013 02:46 PM, Florian Pflug wrote:
>>> If we're going to break compatibility, we should IMHO get rid of
>>> non-zero lower bounds all together. My guess is that the number of
>>> affected users wouldn't be much higher than for the proposed patch,
>>> and it'd allow lossless mapping to most language's native array types…
>> That would actually break a HUGE number of users, since the default lower
>> bound is 1. I have seen any number of pieces if code that rely on that.
> Uh, yeah, we should make it 1 then, not 0, then. As long as the bound
> is fixed, conversion to native C/Java/Ruby/Python/... arrays would still
> be lossless.
>
> best regards,
> Florian Pflug
>
>
Zero as the default lower bound is consistent with most languages
(especially the common ones like C, C++, Java, & Python); in fact I
don't remember any language where that is not the case (ignoring SQL) -
and I've written programs in about 20 languages.

Maybe we should adopt the famous compromise of '0.5'?  :-)


Cheers,
Gavin







2013/4/3 Gavin Flower <GavinFlower@archidevsys.co.nz>
On 04/04/13 03:02, Florian Pflug wrote:
On Apr3, 2013, at 15:30 , Andrew Dunstan <andrew@dunslane.net> wrote:
On 04/02/2013 02:46 PM, Florian Pflug wrote:
If we're going to break compatibility, we should IMHO get rid of
non-zero lower bounds all together. My guess is that the number of
affected users wouldn't be much higher than for the proposed patch,
and it'd allow lossless mapping to most language's native array types…
That would actually break a HUGE number of users, since the default lower
bound is 1. I have seen any number of pieces if code that rely on that.
Uh, yeah, we should make it 1 then, not 0, then. As long as the bound
is fixed, conversion to native C/Java/Ruby/Python/... arrays would still
be lossless.

best regards,
Florian Pflug


Zero as the default lower bound is consistent with most languages (especially the common ones like C, C++, Java, & Python), in fact I don't remember any language where that is not the case (ignoring SQL) - and I've written programs in about 20 languages.

Pascal, Ada, and ALGOL-like languages

Regards

Pavel


 

Maybe we should adopt the famous compromise of '0.5'?  :-)


Cheers,
Gavin




On 04/04/13 04:58, Pavel Stehule wrote:



2013/4/3 Gavin Flower <GavinFlower@archidevsys.co.nz>
On 04/04/13 03:02, Florian Pflug wrote:
On Apr3, 2013, at 15:30 , Andrew Dunstan <andrew@dunslane.net> wrote:
On 04/02/2013 02:46 PM, Florian Pflug wrote:
If we're going to break compatibility, we should IMHO get rid of
non-zero lower bounds all together. My guess is that the number of
affected users wouldn't be much higher than for the proposed patch,
and it'd allow lossless mapping to most language's native array types…
That would actually break a HUGE number of users, since the default lower
bound is 1. I have seen any number of pieces if code that rely on that.
Uh, yeah, we should make it 1 then, not 0, then. As long as the bound
is fixed, conversion to native C/Java/Ruby/Python/... arrays would still
be lossless.

best regards,
Florian Pflug


Zero as the default lower bound is consistent with most languages (especially the common ones like C, C++, Java, & Python), in fact I don't remember any language where that is not the case (ignoring SQL) - and I've written programs in about 20 languages.

pascal, ADA, and ALGOL like languages

Regards

Pavel

ALGOL 60 was zero-based by default; I remember deliberately setting the lower bound to 1. I managed to avoid Pascal, and I only glanced at Ada.


Cheers,
Gavin



2013/4/3 Gavin Flower <GavinFlower@archidevsys.co.nz>
On 04/04/13 04:58, Pavel Stehule wrote:



2013/4/3 Gavin Flower <GavinFlower@archidevsys.co.nz>
On 04/04/13 03:02, Florian Pflug wrote:
On Apr3, 2013, at 15:30 , Andrew Dunstan <andrew@dunslane.net> wrote:
On 04/02/2013 02:46 PM, Florian Pflug wrote:
If we're going to break compatibility, we should IMHO get rid of
non-zero lower bounds all together. My guess is that the number of
affected users wouldn't be much higher than for the proposed patch,
and it'd allow lossless mapping to most language's native array types…
That would actually break a HUGE number of users, since the default lower
bound is 1. I have seen any number of pieces if code that rely on that.
Uh, yeah, we should make it 1 then, not 0, then. As long as the bound
is fixed, conversion to native C/Java/Ruby/Python/... arrays would still
be lossless.

best regards,
Florian Pflug


Zero as the default lower bound is consistent with most languages (especially the common ones like C, C++, Java, & Python), in fact I don't remember any language where that is not the case (ignoring SQL) - and I've written programs in about 20 languages.

pascal, ADA, and ALGOL like languages

Regards

Pavel

ALOGOL 60 was zero based by default, as I remember deliberately setting the lower bound to 1, I managed to avoid PASCAL and I only glanced at ADA.


Cheers,
Gavin



Pavel

ALOGOL 60 was zero based by default, as I remember deliberately setting the lower bound to 1, I managed to avoid PASCAL and I only glanced at ADA.

In Pascal and similar languages (the Wirth family) it is possible to define the lower bound as any value (like in PL/pgSQL). It depends on the school, but a lot of people used 1.

I remember it was one argument in the Pascal/C flame wars.

Regards

Pavel

 


Cheers,
Gavin


>> Zero as the default lower bound is consistent with most languages
>> (especially the common ones like C, C++, Java, & Python), in fact
>> I don't remember any language where that is not the case (ignoring
>> SQL) - and I've written programs in about 20 languages.

Fortran ... Basic ... actually I'd have thought that zero was a
minority position.  Fashions change I guess.
        regards, tom lane



On Apr 3, 2013, at 9:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Fortran ... Basic ... actually I'd have thought that zero was a
> minority position.  Fashions change I guess.

I say we turn the default lower bound up to 11.

David





On 04/03/2013 11:34 AM, Gavin Flower wrote:
>
> Zero as the default lower bound is consistent with most languages 
> (especially the common ones like C, C++, Java, & Python), in fact I 
> don't remember any language where that is not the case (ignoring SQL) 
> - and I've written programs in about 20 languages.

Pascal and Ada are obvious examples.

I'm quite sure there are others, but don't have time to do the research.

cheers

andrew




On Wed, Apr 3, 2013 at 12:36 PM, David E. Wheeler <david@kineticode.com> wrote:
> On Apr 3, 2013, at 9:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Fortran ... Basic ... actually I'd have thought that zero was a
>> minority position.  Fashions change I guess.
>
> I say we turn the default lower bound up to 11.

+1.  I think that's clearly a compromise position that will make
everyone equally happy.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On 3 April 2013 15:10, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I think though that the upthread argument that we'd have multiple
> interpretations of the same thing is bogus.  To me, the core idea that's
> being suggested here is that '{}' should mean a zero-length 1-D array,
> not a zero-D array as formerly.  We would still need a way to represent
> zero-D arrays, if only because they'd still exist on-disk in existing
> databases (assuming we're not willing to break pg_upgrade for this).
> I suggest that that ought *not* involve any braces.  Perhaps '[]=' would
> be a suitable representation.

Then restoring an old SQL dump would silently convert all old zero-D
arrays to zero-length 1-D arrays. Also, if you failed to completely
rewrite all your application code, and accidentally missed a few
functions, you could very easily end up in a situation where your
tables contained a mixture of  zero-D arrays and zero-length 1-D
arrays, that wouldn't compare equally, leading to all sorts of nasty
bugs.

> In the other direction, ISTM that
> '{{},{},{}}' is a zero-by-three array,

Isn't that a three-by-zero array rather than a zero-by-three array?
Either way, what is the use-case for it? I certainly can't remember
ever using a 0x3 or 3x0 matrix.

> entirely distinct from '{}' or
> '{{}}' in dimensionality if not content.

And '{{}}' is a one-by-zero array. There is currently no proposed
syntax for representing a zero-by-zero array, or a zero-by-one array.

All I see here is endless opportunity for confusion and code breakage,
with very little real benefit.

I actually don't think the current behaviour is broken. I find it
perfectly logical and consistent to have a single dimensionless
concept of an empty array. It has no elements, so questions like are
they arranged in 1-D or 2-D, and what index do they start at are
meaningless, and so should return NULL. The only counter-intuitive
thing is that array_length() returns NULL instead of 0 for the empty
array, and it doesn't take long to get used to that behaviour.

Much of this seems to have grown out of a desire to fix the confusing
behaviour of array_length(), but seems to have ended up in a much more
confusing place.

Given that array_length() takes 2 arguments, it is somewhat forgivable
for it to return NULL when you ask for the array's length in a
dimension that it doesn't have. Perhaps what might make the API easier
to work with is a new single-argument function --- e.g., array_size()
that would return the number of elements in the array --- 0 for '{}',
3 for '{1,2,3}', 6 for '{{1,2},{3,4},{5,6}}' and so on.
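
The difference is easiest to see side by side. A sketch of the proposed
behaviour (array_size() is the suggested name, not an existing built-in,
so the commented results are what the proposal implies rather than
output from any current build):

```
SELECT array_length('{}'::int[], 1);              -- NULL (the current quirk)
SELECT array_size('{}'::int[]);                   -- would return 0
SELECT array_size('{1,2,3}'::int[]);              -- would return 3
SELECT array_size('{{1,2},{3,4},{5,6}}'::int[]);  -- would return 6
```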

Regards,
Dean



Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Fortran ... Basic ... actually I'd have thought that zero was a
> minority position.  Fashions change I guess.

When I started programming the top four languages for business
programming were COBOL, BASIC, RPG II, and assembly language.
(Pascal and C came later, and I never saw much use of Fortran by
anyone other than mathematicians.) Except for assembly language,
the subscripts for arrays either started with 1 always, or that was
the default.  Given when it was first developed, it's not too
surprising that the SQL standard adopted 1 as the first element of
an array.

Which is more natural depends on whether you think of the subscript
in terms of ordinal positions or offsets from the base address.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On 04/04/13 05:21, Pavel Stehule wrote:


Pavel

ALOGOL 60 was zero based by default, as I remember deliberately setting the lower bound to 1, I managed to avoid PASCAL and I only glanced at ADA.

In Pascal and similar languages (Wirth family) is possible to define lower bound - any value (like PL/pgSQL). Depends on a school - but lot of people used 1.

I remember - it was one argument for in Pascal/C flame war.

Regards

Pavel

[...]

At the time (just over 40 years ago!) I was adamant that arrays should start with an index of one, now I much prefer zero.  Probably I prefer zero in part, because now I understand what is happening at the machine code level, and partly because zero is the default for the main languages I use.

Wasting time on Google (I have 'more' important things I 'should' be doing!), I find ALGOL 60 did not appear to have a default value for the lower index - not only that, but one could make it negative!  see: http://www.csci.csusb.edu/dick/samples/algol60.syntax.html

Anyhow, I think we should standardise on zero as the initial index to be as consistent as practicable.  However, not with a religious zeal at the expense of practical considerations!


Cheers,
Gavin
On 2013-04-04 07:42:36 +1300, Gavin Flower wrote:
> On 04/04/13 05:21, Pavel Stehule wrote:
> >
> >>
> >>        Pavel
> >>
> >        ALOGOL 60 was zero based by default, as I remember
> >        deliberately setting the lower bound to 1, I managed to avoid
> >        PASCAL and I only glanced at ADA.
> >
> >
> >    http://en.wikipedia.org/wiki/Comparison_of_programming_languages_%28array%29
> >
> >
> >In Pascal and similar languages (Wirth family) is possible to define lower
> >bound - any value (like PL/pgSQL). Depends on a school - but lot of people
> >used 1.
> >
> >I remember - it was one argument for in Pascal/C flame war.
> >
> >Regards
> >
> >Pavel
> >
> [...]
> 
> At the time (just over 40 years ago!) I was adamant that arrays should start
> with an index of one, now I much prefer zero.  Probably I prefer zero in
> part, because now I understand what is happening at the machine code level,
> and partly because zero is the default for the main languages I use.
> 
> Wasting time on Google (I have 'more' important things I 'should' be
> doing!), I find ALGOL 60 did not appear to have a default value for the
> lower index - not only that, but one could make it negative! see:
> http://www.csci.csusb.edu/dick/samples/algol60.syntax.html
> 
> Anyhow, I think we should standardise on zero as the initial index to be as
> consistent as practicable.  However, not with a religious zeal at the
> expense of practical considerations!

Changing this now, rather than on a green field, strikes me as a pretty
absurd exercise in frustrating users by subtly breaking their queries.
Since 1 is a valid index, the breakage won't always be visible.

-1

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



On 04/04/13 05:30, Tom Lane wrote:
>> Zero as the default lower bound is consistent with most languages
>> (especially the common ones like C, C++, Java, & Python), in fact
>> I don't remember any language where that is not the case (ignoring
>> SQL) - and I've written programs in about 20 languages.
>
> Fortran ... Basic ... actually I'd have thought that zero was a
> minority position.  Fashions change I guess.
>
>         regards, tom lane

I had forgotten the indexing in BASIC & FORTRAN; I now recall COBOL's
TABLEs were equivalent to arrays, and they started at one.


Cheers,
Gavin
Gavin Flower <GavinFlower@archidevsys.co.nz> wrote:

> Anyhow, I think we should standardise on zero as the initial
> index to be as consistent as practicable.

If you want to suggest a default of zero for the first subscript of
an array in SQL, please don't confuse the issue by using any form
of the word "standard" in that proposal.  There are ANSI and ISO
standards for SQL, and they require that the first element of an
array is one.  I'm OK with *extending* the standard by *allowing*
other values, but let's not flout the standard and break existing
code by changing the *default*.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On 04/04/13 05:36, David E. Wheeler wrote:
> On Apr 3, 2013, at 9:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Fortran ... Basic ... actually I'd have thought that zero was a
>> minority position.  Fashions change I guess.
>
> I say we turn the default lower bound up to 11.
>
> David

In keeping with the level of irrationality in this thread, maybe we
should set it to an irrational number like the square root of 2, or
transcend ourselves and make it a transcendental number like pi!  :-)

I suppose using the square root of minus one would be considered too
imaginative???  :-)


Cheers,
Gavin
On 04/04/13 07:58, Kevin Grittner wrote:
> Gavin Flower <GavinFlower@archidevsys.co.nz> wrote:
>
>> Anyhow, I think we should standardise on zero as the initial
>> index to be as consistent as practicable.
> If you want to suggest a default of zero for the first subscript of
> an array in SQL, please don't confuse the issue by using any form
> of the word "standard" in that proposal.  There are ANSI and ISO
> standards for SQL, and they require that the first element of an
> array is one.  I'm OK with *extending* the standard by *allowing*
> other values, but let's not flaunt the standard and break existing
> code by changing the *default*.
>
> --
> Kevin Grittner
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
You omitted my rider 'However, not with a religious zeal at the expense 
of practical considerations!' Which was meant to cover concerns like yours.


Cheers,
Gavin



On 2013-04-04 08:03:03 +1300, Gavin Flower wrote:
> On 04/04/13 07:58, Kevin Grittner wrote:
> >Gavin Flower <GavinFlower@archidevsys.co.nz> wrote:
> >
> >>Anyhow, I think we should standardise on zero as the initial
> >>index to be as consistent as practicable.
> >If you want to suggest a default of zero for the first subscript of
> >an array in SQL, please don't confuse the issue by using any form
> >of the word "standard" in that proposal.  There are ANSI and ISO
> >standards for SQL, and they require that the first element of an
> >array is one.  I'm OK with *extending* the standard by *allowing*
> >other values, but let's not flaunt the standard and break existing
> >code by changing the *default*.

That's already possible:

postgres=# SELECT ('[0:3]={e1,e2,e3,e4}'::text[])[0];
 text
------
 e1
(1 row)

Not too convenient, but ...

> You omitted my rider 'However, not with a religious zeal at the expense of
> practical considerations!' Which was meant to cover concerns like yours.

Given it's already possible, I don't understand what you propose then.
A GUC that allows changing the default? That would need to be included
in dumps and such, and would make old dumps - which won't include an
explicit SET of the current value - ambiguous to load. So that doesn't
seem to be a good idea either.

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



On 4/3/13 10:34 AM, Gavin Flower wrote:
> Maybe we should adopt the famous compromise of '0.5'?

+0.5. ;P
-- 
Jim C. Nasby, Data Architect                       jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net



On 04/04/13 11:55, Jim Nasby wrote:
> On 4/3/13 10:34 AM, Gavin Flower wrote:
>> Maybe we should adopt the famous compromise of '0.5'?
>
> +0.5. ;P

Far too positive for our grim world!  How about '-0.5' instead?  :-)

I notice you call yourself a 'Data Architect' - never too sure if I
should call myself a 'software architect who pretends he can program' or
a 'programmer with pretensions of being a software architect'?  :-)
On 4 April 2013 01:10, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I think though that the upthread argument that we'd have multiple
> interpretations of the same thing is bogus.  To me, the core idea that's
> being suggested here is that '{}' should mean a zero-length 1-D array,
> not a zero-D array as formerly.  We would still need a way to represent
> zero-D arrays, if only because they'd still exist on-disk in existing
> databases (assuming we're not willing to break pg_upgrade for this).

Tom,

My thought was that on-disk zero-D arrays should be converted into
empty 1-D arrays (with default lower bounds of course) when they are
read by array_recv.  Any SQL operation on your zero-D arrays would
therefore resolve as though they were 1-D.  A pg_dump/restore would
result in the arrays being 1-D on the restore side.  If pg_upgrade
conserves the zero-D array in binary form, that's okay since the
receiving end will just treat it as 1-D out of array_recv anyway.

My intention was that the zero-D array could continue to live
indefinitely in binary form, but would never be observable as such by
any application code.

Cheers,
BJ



Brendan Jurd <direvus@gmail.com> writes:
> My thought was that on-disk zero-D arrays should be converted into
> empty 1-D arrays (with default lower bounds of course) when they are
> read by array_recv.

Huh?  array_recv would not get applied to datums coming off of disk.
The only way to make this 100% transparent would be to go through every
C-coded function that deals with arrays and make sure it delivers
identical results for both cases.  It's possible we could do that for
array functions in the core code, but what about user-written
extensions?

In any case, the whole exercise is pointless if we don't change the
visible behavior of array_dims et al.  So I think the idea that this
would be without visible consequence is silly.  What's up for argument
is just how much incompatibility is acceptable.
        regards, tom lane



On 4 April 2013 15:11, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Brendan Jurd <direvus@gmail.com> writes:
>> My thought was that on-disk zero-D arrays should be converted into
>> empty 1-D arrays (with default lower bounds of course) when they are
>> read by array_recv.
>
> Huh?  array_recv would not get applied to datums coming off of disk.

My mistake, sorry for the noise.

> In any case, the whole exercise is pointless if we don't change the
> visible behavior of array_dims et al.  So I think the idea that this
> would be without visible consequence is silly.  What's up for argument
> is just how much incompatibility is acceptable.

I don't know that anyone was suggesting there would be no visible
consequences of any kind.  I was hoping that we could at least
represent on-disk zero-D arrays as though they were 1-D.

If that's not going to fly, and we are stuck with continuing to allow
zero-D as a valid representation, then perhaps your '[]=' syntax would
be the way to proceed.  It would not be terribly difficult to rework
the patch along those lines, although I have to admit "allow empty
arrays with dimensions" is not nearly so satisfying a title as
"exorcise zero-dimensional arrays".

Cheers,
BJ



On 2013-04-03 20:58, Gavin Flower wrote:
> On 04/04/13 05:36, David E. Wheeler wrote:
>> On Apr 3, 2013, at 9:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Fortran ... Basic ... actually I'd have thought that zero was a
>>> minority position.  Fashions change I guess.
>>
>> I say we turn the default lower bound up to 11.
>>
>> David
>
> In keeping with the level of irrationality in this thread, maybe we
> should set it to an irrational number like the square root of 2, or
> transcend ourselves and make it a transcendental number like pi!  :-)
>
> I suppose using the square root of minus one would be considered too
> imaginative???  :-)

Nah, that would make arrays have 2 dimensions as a minimum... :-)

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/
On Wed, Apr 3, 2013 at 11:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> In any case, the whole exercise is pointless if we don't change the
> visible behavior of array_dims et al.  So I think the idea that this
> would be without visible consequence is silly.  What's up for argument
> is just how much incompatibility is acceptable.

The only reasonable answer for this (a provably used, non-security,
non-standards violating, non-gross functionality breakage case) is
*zero*.  Our historically cavalier attitude towards compatibility
breakage has been an immense disservice to our users and encourages
very bad upgrade habits and is, IMNSHO, embarrassing.

Changing the way array_dims works for a minor functionality
enhancement is gratuitous and should be done, if at all, via a loudly
advertised deprecation/replacement cycle with a guarding GUC (yes, I
hate them too, but not nearly as much as the expense of qualifying
vast code bases against random compatibility breakages every release).

merlin



On Thu, Apr 4, 2013 at 11:10 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> The only reasonable answer for this (a provably used, non-security,
> non-standards violating, non-gross functionality breakage case) is
> *zero*.

+1.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Apr 4, 2013 at 11:10 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> The only reasonable answer for this (a provably used, non-security,
>> non-standards violating, non-gross functionality breakage case) is
>> *zero*.

> +1.

Well, if we're going to take that hard a line on it, then we can't
change anything about array data storage or the existing functions'
behavior; which leaves us with either doing nothing at all, or
inventing new functions that have saner behavior while leaving the
old ones in place.
        regards, tom lane



On 5 April 2013 07:43, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Well, if we're going to take that hard a line on it, then we can't
> change anything about array data storage or the existing functions'
> behavior; which leaves us with either doing nothing at all, or
> inventing new functions that have saner behavior while leaving the
> old ones in place.

And then we are in the awkward position of trying to explain the
differences in behaviour between the old and new functions ...
presumably with a dash of "for historical reasons" and a sprinkling of
"to preserve compatibility" in every other paragraph.

The other suggestion that had been tossed around elsewhere upthread
was inventing a new type that serves the demand for a straightforward
mutable list, which has exactly one dimension, and which may be
sensibly empty.  Those few who are interested in dimensions >= 2 could
keep on using "arrays", with all their backwards-compatible silliness
intact, and everybody else could migrate to "lists" at their leisure.

I don't hate the latter idea from a user perspective, but from a
developer perspective I suspect there are valid objections to be made.

Cheers,
BJ



Brendan Jurd <direvus@gmail.com> writes:
> The other suggestion that had been tossed around elsewhere upthread
> was inventing a new type that serves the demand for a straightforward
> mutable list, which has exactly one dimension, and which may be
> sensibly empty.  Those few who are interested in dimensions >= 2 could
> keep on using "arrays", with all their backwards-compatible silliness
> intact, and everybody else could migrate to "lists" at their leisure.

> I don't hate the latter idea from a user perspective, but from a
> developer perspective I suspect there are valid objections to be made.

The real problem with that is that the existing arrays have glommed onto
the syntax that is both most natural and SQL-spec-required.  I don't
think there is a lot of room to shove in a different kind of critter
there.  (There's been a remarkable lack of attention to the question
of spec compliance in this thread, btw.  Surely the standard has
something to say on the matter of zero-length arrays?)
        regards, tom lane



On 5 April 2013 13:04, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> (There's been a remarkable lack of attention to the question
> of spec compliance in this thread, btw.  Surely the standard has
> something to say on the matter of zero-length arrays?)

From 4.10 in my draft copy of "Foundation", arrays are one of two
"collection" types (the other being multisets), and:
 "A collection is a composite value comprising zero or more elements,
each a value of some data type DT"
 "The number of elements in C is the cardinality of C"
 "An array is a collection A in which each element is associated with
exactly one ordinal position in A. If n is
the cardinality of A, then the ordinal position p of an element is an
integer in the range 1 (one) ≤ p ≤ n."

The language specifically allows for zero elements, and does not
contemplate multiple dimensions.  The specification for the array
constructor syntax (6.36) and array element reference by subscript
(6.23) also make it fairly clear that only 1-D arrays were being
considered.

I'd say we've already gone way off-menu by having multidims.  A more
compliant approach would have been to implement arrays as 1-D only,
and then maybe have a separate thing ("matrices"?) for multidims.

While I was in there I noticed CARDINALITY, which would be pretty easy
to add and would at least provide a more productive way to get the
"real" length of an array without disrupting existing functionality:

"<cardinality expression> ::=
CARDINALITY <left paren> <collection value expression> <right paren>"

Cheers,
BJ



Brendan Jurd <direvus@gmail.com> writes:
> On 5 April 2013 13:04, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> (There's been a remarkable lack of attention to the question
>> of spec compliance in this thread, btw.  Surely the standard has
>> something to say on the matter of zero-length arrays?)

> The language specifically allows for zero elements, and does not
> contemplate multiple dimensions.  The specification for the array
> constructor syntax (6.36) and array element reference by subscript
> (6.23) also make it fairly clear that only 1-D arrays were being
> considered.

> I'd say we've already gone way off-menu by having multidims.

Yeah, we knew that.  I don't have a problem with seeing multidim arrays
as an extension to the standard though.  The point is that the behavior
for the 1-D, default-lower-bound case ought to match the standard.

> While I was in there I noticed CARDINALITY, which would be pretty easy
> to add and would at least provide a more productive way to get the
> "real" length of an array without disrupting existing functionality:

Yeah, that would at least fix the null-result-for-empty-array problem
for that particular functionality.  Still, this is ammunition for the
position that null results for empty arrays are just broken.

BTW ... if you check the archives you will find that we had
cardinality() for a short while, and removed it before 8.4 release,
because we couldn't agree on what it ought to return when given a
multi-dimensional array.  I'm afraid that issue is still unresolved.
        regards, tom lane



On 5 April 2013 15:05, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Brendan Jurd <direvus@gmail.com> writes:
>> While I was in there I noticed CARDINALITY, which would be pretty easy
>> to add and would at least provide a more productive way to get the
>> "real" length of an array without disrupting existing functionality:
>
> Yeah, that would at least fix the null-result-for-empty-array problem
> for that particular functionality.  Still, this is ammunition for the
> position that null results for empty arrays are just broken.
>
> BTW ... if you check the archives you will find that we had
> cardinality() for a short while, and removed it before 8.4 release,
> because we couldn't agree on what it ought to return when given a
> multi-dimensional array.  I'm afraid that issue is still unresolved.

Well for what it's worth I would expect cardinality() to return the
total number of elements in the array (per ArrayGetNItems).  It's
consistent with the spec's identification of an array as a
"collection".  You can chunk the elements into dimensions however you
want, but it's still a collection of elements, and the cardinality is
still the number of elements.

The "nesting" interpretation doesn't accord with our internal
representation, nor with our requirement that multidim arrays be
regular, nor with the fact that we can't put an array of texts inside
an array of ints.  Our array input syntaxes for multidim arrays look
nest-ish but what they produce is not nested.

Cheers,
BJ



Re: [PATCH] Exorcise "zero-dimensional" arrays

From
Dimitri Fontaine
Date:
Brendan Jurd <direvus@gmail.com> writes:
> Well for what it's worth I would expect cardinality() to return the
> total number of elements in the array (per ArrayGetNItems).  It's
> consistent with the spec's identification of an array as a
> "collection".  You can chunk the elements into dimensions however you
> want, but it's still a collection of elements, and the cardinality is
> still the number of elements.

+1

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support



Brendan Jurd <direvus@gmail.com> wrote:

> The language specifically allows for zero elements, and does not
> contemplate multiple dimensions.

I don't remember anything in the spec which would prohibit the data
type of an array element from itself being an array, however.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On 6 April 2013 01:59, Kevin Grittner <kgrittn@ymail.com> wrote:
> Brendan Jurd <direvus@gmail.com> wrote:
>
>> The language specifically allows for zero elements, and does not
>> contemplate multiple dimensions.
>
> I don't remember anything in the spec which would prohibit the data
> type of an array element from itself being an array, however.

Indeed it does not prohibit nesting arrays inside other arrays, but
the multidim arrays that Postgres allows you to create are not the
same thing as nested arrays.

I believe that a purely to-spec implementation would allow you to make
an array-of-int-arrays, but since each element is its own separate
collection there would be no requirement that they have the same
cardinality as each other.

For example, ARRAY[[1], [2,3], [4,5,6]] is a valid collection per the
spec, but Postgres won't let you create this, because Postgres is
trying to create a 2-D matrix of integers, rather than a collection of
collections of integers.
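To make the distinction concrete, here is a rough Python model (not anything in the backend; the helper name is made up) of the regularity check that separates a spec-style collection of collections from a Postgres matrix:

```python
# Hypothetical sketch: a spec-style nested collection imposes no
# regularity, whereas a 2-D matrix requires every sub-array to have
# the same length.

def is_regular(nested):
    """Return True if a list of lists could be stored as a 2-D matrix."""
    lengths = {len(inner) for inner in nested}
    return len(lengths) <= 1

# Valid as a collection of collections of integers, per the spec:
spec_style = [[1], [2, 3], [4, 5, 6]]

# Postgres's ARRAY[[1],[2,3],[4,5,6]] fails precisely because the
# would-be matrix is ragged:
assert not is_regular(spec_style)
assert is_regular([[1, 2], [3, 4], [5, 6]])
```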

The inability to extend multidim arrays in Postgres is another
manifestation of this matrix-oriented design.  Extending nested
collections is a no-brainer.  Extending matrices while ensuring they
remain perfectly regular ... not so much.

Cheers,
BJ



Brendan Jurd <direvus@gmail.com> wrote:
> On 6 April 2013 01:59, Kevin Grittner <kgrittn@ymail.com> wrote:
>> Brendan Jurd <direvus@gmail.com> wrote:
>>
>>> The language specifically allows for zero elements, and does not
>>> contemplate multiple dimensions.
>>
>> I don't remember anything in the spec which would prohibit the data
>> type of an array element from itself being an array, however.
>
> Indeed it does not prohibit nesting arrays inside other arrays, but
> the multidim arrays that Postgres allows you to create are not the
> same thing as nested arrays.
>
> I believe that a purely to-spec implementation would allow you to make
> an array-of-int-arrays, but since each element is its own separate
> collection there would be no requirement that they have the same
> cardinality as each other.
>
> For example, ARRAY[[1], [2,3], [4,5,6]] is a valid collection per the
> spec, but Postgres won't let you create this, because Postgres is
> trying to create a 2-D matrix of integers, rather than a collection of
> collections of integers.
>
> The inability to extend multidim arrays in Postgres is another
> manifestation of this matrix-oriented design.  Extending nested
> collections is a no-brainer.  Extending matrices while ensuring they
> remain perfectly regular ... not so much.

Your interpretation matches mine all around.  It is unfortunate
that we have hijacked the standard's syntax for arrays to add a
matrix feature.  It seems to leave us without any way forward on
these issues that I like very much.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On 7 April 2013 01:43, Kevin Grittner <kgrittn@ymail.com> wrote:
> Brendan Jurd <direvus@gmail.com> wrote:
>> Indeed it does not prohibit nesting arrays inside other arrays, but
>> the multidim arrays that Postgres allows you to create are not the
>> same thing as nested arrays.
>>
> Your interpretation matches mine all around.  It is unfortunate
> that we have hijacked the standard's syntax for arrays to add a
> matrix feature.

It really is unfortunate.  I wonder if it was done in an attempt to
mimic Oracle behaviour.

> It seems to leave us without any way forward on
> these issues that I like very much.

On the specific issue of CARDINALITY, I guess we need to decide
whether we are going to pretend that our array/matrix thing is
actually nested.  I first argued that we should not.   But it occurred
to me that if we do pretend, it would at least leave the door ajar if
we want to do something to make our arrays more nest-like in future,
without disrupting the behaviour of CARDINALITY.

It is unlikely that we ever would make such a change, but given the
intensity of the opposition to any kind of SQL-level behavioural
changes we've had on this thread, I don't want to create any more
barriers to future efforts to comport with the spec.

So how about:

* we add CARDINALITY, make it work like array_length(a, 1) except that
it returns zero for empty arrays, and
* we add array_num_items, which exposes the internal ArrayGetNItems,
and returns zero for empty arrays.

As in:

CARDINALITY(ARRAY[[1,2], [3,4], [5,6]]) => 3
array_num_items(ARRAY[[1,2], [3,4], [5,6]]) => 6
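For what it's worth, the two proposed behaviours reduce to a one-line difference over the dimension lengths.  A hedged Python model (the `dims` argument is just a list of dimension lengths, not any real API):

```python
from math import prod

def cardinality(dims):
    """Length of the first dimension (the nested reading); 0 if empty."""
    return dims[0] if dims else 0

def array_num_items(dims):
    """Total element count, as ArrayGetNItems computes it."""
    return prod(dims) if dims else 0

# ARRAY[[1,2],[3,4],[5,6]] has dimension lengths [3, 2]:
assert cardinality([3, 2]) == 3
assert array_num_items([3, 2]) == 6
```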

Cheers,
BJ



Brendan Jurd <direvus@gmail.com> writes:
> On 7 April 2013 01:43, Kevin Grittner <kgrittn@ymail.com> wrote:
>> Your interpretation matches mine all around.  It is unfortunate
>> that we have hijacked the standard's syntax for arrays to add a
>> matrix feature.

> It really is unfortunate.  I wonder if it was done in an attempt to
> mimic Oracle behaviour.

Hardly likely.  That code goes back to Berkeley days (PostQUEL) ---
there is clear ancestry from the array code in Postgres v4r2 released
June 1994.  It's more or less a coincidence that it matches the SQL spec
at all, and I'd be astonished if it matched Oracle particularly closely.

> On the specific issue of CARDINALITY, I guess we need to decide
> whether we are going to pretend that our array/matrix thing is
> actually nested.  I first argued that we should not.   But it occurred
> to me that if we do pretend, it would at least leave the door ajar if
> we want to do something to make our arrays more nest-like in future,
> without disrupting the behaviour of CARDINALITY.

This seems to be exactly the same uncertainty that we couldn't resolve
back in the 8.4 devel cycle, for exactly the same reasons.  I don't see
that the discussion has moved forward any :-(
        regards, tom lane



On 8 April 2013 16:09, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Brendan Jurd <direvus@gmail.com> writes:
>> On the specific issue of CARDINALITY, I guess we need to decide
>> whether we are going to pretend that our array/matrix thing is
>> actually nested.  I first argued that we should not.   But it occurred
>> to me that if we do pretend, it would at least leave the door ajar if
>> we want to do something to make our arrays more nest-like in future,
>> without disrupting the behaviour of CARDINALITY.
>
> This seems to be exactly the same uncertainty that we couldn't resolve
> back in the 8.4 devel cycle, for exactly the same reasons.  I don't see
> that the discussion has moved forward any :-(
>

I had a poke around in the archives, and it seems to me that the major
argument that was advanced in favour of making cardinality() return
the total number of items was ... we don't have anything that does
that yet.  That's why I'm proposing we add array_num_items as well --
I do think there should be a function for this, I just don't think
cardinality fits the bill.

Cheers,
BJ




On Thu, Apr 4, 2013 at 4:10 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
The only reasonable answer for this (a provably used, non-security,
non-standards violating, non-gross functionality breakage case) is
*zero*.  Our historically cavalier attitude towards compatibility
breakage has been an immense disservice to our users and encourages
very bad upgrade habits and is, IMNSHO, embarrassing.

The flip side of this is that the ability to make improvements freely is one of the biggest strengths of free software over commercial software. Oracle and Microsoft invest *tons* of money in maintaining huge libraries of backward-compatibility code, which drastically limits their ability to keep making improvements.


--
greg



All,

As much as I have a keen interest in this feature, it isn't (AFAIK)
being considered for 9.3.  Given that it's generated a fair amount of
controversy, could we table it until 9.3 beta?  There's still plenty of
unresolved 9.3 patches in the queue.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



On Mon, Apr  8, 2013 at 04:24:31PM -0700, Josh Berkus wrote:
> All,
> 
> As much as I have a keen interest in this feature, it isn't (AFAIK)
> being considered for 9.3.  Given that it's generated a fair amount of
> controversy, could we table it until 9.3 beta?  There's still plenty of
> unresolved 9.3 patches in the queue.

Can someone add a TODO item or a link to the thread?

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +



On 9 April 2013 09:24, Josh Berkus <josh@agliodbs.com> wrote:
> As much as I have a keen interest in this feature, it isn't (AFAIK)
> being considered for 9.3.  Given that it's generated a fair amount of
> controversy, could we table it until 9.3 beta?  There's still plenty of
> unresolved 9.3 patches in the queue.

No problem.  I certainly wasn't expecting it to run for 90 messages
when I started out.  I'll pipe down for now and resume after the beta.

Cheers,
BJ



On 04/08/2013 07:16 PM, Brendan Jurd wrote:
> On 9 April 2013 09:24, Josh Berkus <josh@agliodbs.com> wrote:
>> As much as I have a keen interest in this feature, it isn't (AFAIK)
>> being considered for 9.3.  Given that it's generated a fair amount of
>> controversy, could we table it until 9.3 beta?  There's still plenty of
>> unresolved 9.3 patches in the queue.
> 
> No problem.  I certainly wasn't expecting it to run for 90 messages
> when I started out.  I'll pipe down for now and resume after the beta.

What's the status on this patch and current approach to ZDA?


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



On 12 June 2013 04:43, Josh Berkus <josh@agliodbs.com> wrote:
> What's the status on this patch and current approach to ZDA?

Alright, it might be a good idea to have a quick recap.

Last time, on Arrays Of Our Lives ...

So I proposed and posted a patch aimed at deprecating zero-D arrays,
and replacing them with 1-D empty arrays.

That kicked off a long discussion, with some folks (Robert, Merlin,
Florian et al) vigorously opposing any change in the behaviour of the
array_(length|ndims|lower|upper) functions.  To those folks, the
behaviour would have to be a lot worse than it currently is to justify
breaking compatibility with existing applications.

The idea of using a GUC to smooth over the compatibility break was
suggested, and firmly rejected.

The idea of creating an entirely new type with nicer behaviours came
up, but that wouldn't really fly because arrays have already hogged
all the best syntax.

Since compatibility breakage is so contentious, I suggested that we
forget about changing the array representation and just add new
functions with more sensible behaviours:

* cardinality(anyarray) to return the length of the first dimension,
zero if empty, and
* array_num_items(anyarray) to return the total number of element
positions per ArrayGetNItems, zero if empty.

There have been attempts to add a cardinality function in the past, as
it is required by the SQL spec, but these attempts have stalled when
trying to decide how it should handle multidim arrays.  Having it
return the length of the first dimension is the more spec-compatible
way to go, but some folks argued that it should work as
ArrayGetNItems, because we don't already have a function for that at
the SQL level.  Therefore I propose we add cardinality() per the spec,
and another function to expose ArrayGetNItems.

And that's about where we got to, when the whole discussion was put on
a time-out to make room for the beta.

I am withdrawing the original zero-D patch in favour of the proposed
new functions.  If you have an opinion about that, please do chime in.
Depending on how that goes I may post a patch implementing my new
proposal in the next few days.

Cheers,
BJ



On Jun 11, 2013, at 3:09 PM, Brendan Jurd <direvus@gmail.com> wrote:

> There have been attempts to add a cardinality function in the past, as
> it is required by the SQL spec, but these attempts have stalled when
> trying to decide how it should handle multidim arrays.  Having it
> return the length of the first dimension is the more spec-compatible
> way to go, but some folks argued that it should work as
> ArrayGetNItems, because we don't already have a function for that at
> the SQL level.  Therefore I propose we add cardinality() per the spec,
> and another function to expose ArrayGetNItems.
>
> And that's about where we got to, when the whole discussion was put on
> a time-out to make room for the beta.
>
> I am withdrawing the original zero-D patch in favour of the proposed
> new functions.  If you have an opinion about that, please do chime in.
> Depending on how that goes I may post a patch implementing my new
> proposal in the next few days.

+1 to this proposal. Modulo function names, perhaps. I don’t much care
what they're called, as long as they work as you describe here.

Best,

David
Array Complainer




On 12 June 2013 04:50, David E. Wheeler <david@justatheory.com> wrote:
> On Jun 11, 2013, at 3:09 PM, Brendan Jurd <direvus@gmail.com> wrote:
>
>> There have been attempts to add a cardinality function in the past, as
>> it is required by the SQL spec, but these attempts have stalled when
>> trying to decide how it should handle multidim arrays.  Having it
>> return the length of the first dimension is the more spec-compatible
>> way to go, but some folks argued that it should work as
>> ArrayGetNItems, because we don't already have a function for that at
>> the SQL level.  Therefore I propose we add cardinality() per the spec,
>> and another function to expose ArrayGetNItems.
>>
>> And that's about where we got to, when the whole discussion was put on
>> a time-out to make room for the beta.
>>
>> I am withdrawing the original zero-D patch in favour of the proposed
>> new functions.  If you have an opinion about that, please do chime in.
>> Depending on how that goes I may post a patch implementing my new
>> proposal in the next few days.
>
> +1 to this proposal. Modulo function names, perhaps. I don’t much care
> what they're called, as long as they work as you describe here.
>

+1 for having a function to return the total number of elements in an
array, because that's something that's currently missing from SQL.

However, I think that CARDINALITY() should be that function.

I'm not convinced that having CARDINALITY() return the length of the
first dimension is more spec-compatible, since our multi-dimensional
arrays aren't nested arrays, and it seems unlikely that they ever will
be. I'd argue that it's at least equally spec-compatible to have
CARDINALITY() return the total number of elements in the array, if you
think of a multi-dimensional array as a collection of elements
arranged in a regular pattern.

Also, the spec describes CARDINALITY() and UNNEST() using the same
language, and I think it's implicit in a couple of places that
CARDINALITY() should match the number of rows returned by UNNEST(),
which we've already implemented as fully unnesting every element.

We're about to add ORDINALITY to UNNEST(), and to me it would be very
odd to have the resulting maximum ordinality exceed the array's
cardinality.
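That consistency argument can be sketched outside SQL.  A rough Python model of full unnesting (not the server's implementation):

```python
from itertools import chain

def unnest(a):
    """Fully flatten a regular nested list, the way Postgres's unnest()
    fully unnests a multidim array."""
    if a and isinstance(a[0], list):
        return unnest(list(chain.from_iterable(a)))
    return a

rows = unnest([[1, 2], [3, 4], [5, 6]])

# WITH ORDINALITY numbers the output rows 1..n, so the maximum ordinality
# equals the total element count -- the argument is that CARDINALITY
# should report that same number:
assert rows == [1, 2, 3, 4, 5, 6]
assert len(rows) == 6
```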

Regards,
Dean



On 12 June 2013 18:22, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
> +1 for having a function to return the total number of elements in an
> array, because that's something that's currently missing from SQL.
>
> However, I think that CARDINALITY() should be that function.
>
> I'm not convinced that having CARDINALITY() return the length of the
> first dimension is more spec-compatible, since our multi-dimensional
> arrays aren't nested arrays, and it seems unlikely that they ever will
> be. I'd argue that it's at least equally spec-compatible to have
> CARDINALITY() return the total number of elements in the array, if you
> think of a multi-dimensional array as a collection of elements
> arranged in a regular pattern.

It's true that our multidims aren't nested, but they are the nearest
thing we have.  If we want to keep the door open for future attempts
to nudge multidim arrays into closer approximation of nested arrays,
it would be better to have the nested interpretation of CARDINALITY.
Given what we've just gone through with array_length, it seems that
once we select a behaviour for CARDINALITY, we will be stuck with it
permanently.

The problem with thinking of our multidim arrays as just a weirdly
crumpled arrangement of a single collection, is that we've already
abused the nesting syntax for declaring them.

> Also, the spec describes CARDINALITY() and UNNEST() using the same
> language, and I think it's implicit in a couple of places that
> CARDINALITY() should match the number of rows returned by UNNEST(),
> which we've already implemented as fully unnesting every element.
>
> We're about to add ORDINALITY to UNNEST(), and to me it would be very
> odd to have the resulting maximum ordinality exceed the array's
> cardinality.

Yeah, that makes sense.  Well the good news is that either way,
CARDINALITY will do what people want in the most common case where the
array is one-dimensional.

Multidim arrays are why we can't have nice things.

Cheers,
BJ



On Wed, Jun 12, 2013 at 5:05 AM, Brendan Jurd <direvus@gmail.com> wrote:
> On 12 June 2013 18:22, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>> +1 for having a function to return the total number of elements in an
>> array, because that's something that's currently missing from SQL.
>>
>> However, I think that CARDINALITY() should be that function.
>>
>> I'm not convinced that having CARDINALITY() return the length of the
>> first dimension is more spec-compatible, since our multi-dimensional
>> arrays aren't nested arrays, and it seems unlikely that they ever will
>> be. I'd argue that it's at least equally spec-compatible to have
>> CARDINALITY() return the total number of elements in the array, if you
>> think of a multi-dimensional array as a collection of elements
>> arranged in a regular pattern.
>
> It's true that our multidims aren't nested, but they are the nearest
> thing we have.  If we want to keep the door open for future attempts
> to nudge multidim arrays into closer approximation of nested arrays,
> it would be better to have the nested interpretation of CARDINALITY.

I think there's just about zero chance of something like that ever
happening.  The problem is that our type system just can't support it.A function or operator that takes an array needs
todeclare whether
 
it's going to return an array or whether it's going to return the base
type.  It can't decide to return one or the other at run-time
depending on the dimensionality of the array.

For this to really work, we'd need the number of dimensions to be
baked into the array type.  The obvious implementation would be to
have N array types per base type rather than 1, each with a different
number of dimensions.  Then a subscripting function which took a
1-dimensional array could return anyelement, and the similarly named
function which took a 2-dimensional array could return a 1-dimensional
array.

I believe the reason it wasn't done this way initially was because of
pg_type bloat; having 6 extra type definitions for every type we
support is unappealing.  We could force them to be explicitly declared
as we do for range types.  Or we could rewrite a whole lotta code to
understand a "type" as something more complex than "an OID from
pg_type", so that we don't need pre-defined entries in pg_type for
array types in the first place.

But none of these things are nudges.  Making any real improvement in
this area is going to take major surgery, not a nudge.

> Multidim arrays are why we can't have nice things.

Yeah, I think that was not our best moment.  :-(

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



>> Multidim arrays are why we can't have nice things.
> 
> Yeah, I think that was not our best moment.  :-(

They were one of those hacks which looked really clever at the time, but
proved not to be so.  Unfortunately, they *are* useful, and are being
used; I use MD arrays all the time to push data in and out of PL/R, and
now that we have a JSON type I'm using them to generate arrays-of-arrays
using JSON conversion functions.  I'm sure many others are doing the
same, so there's no killing the feature.

Actually, if you think about it, we've never had "arrays" in PostgreSQL
... we have always had matrixes.  If you think about things that way,
most of the current functionality makes sense.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Josh Berkus <josh@agliodbs.com> writes:
> Actually, if you think about it, we've never had "arrays" in PostgreSQL
> ... we have always had matrixes.  If you think about things that way,
> most of the current functionality makes sense.

Arguably, the only thing wrong with multidim arrays is that they're not
what the SQL standard asks for.  However, the original point in this
thread was that we have some very bizarre corner-case behavior for empty
arrays.  I'm going to be disappointed if all we can get out of this is
a cardinality() function, and nothing is done about the empty-array
semantics.
        regards, tom lane



On 06/12/2013 11:01 AM, Tom Lane wrote:

> Arguably, the only thing wrong with multidim arrays is that they're not
> what the SQL standard asks for.  However, the original point in this
> thread was that we have some very bizarre corner-case behavior for empty
> arrays.  I'm going to be disappointed if all we can get out of this is
> a cardinality() function, and nothing is done about the empty-array
> semantics.

Well, we can't change the zero-dim behavior without breaking backwards
compatibility.  And enough people piled on to say NO to that, that it
went by the wayside.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Josh Berkus <josh@agliodbs.com> writes:
> On 06/12/2013 11:01 AM, Tom Lane wrote:
>> I'm going to be disappointed if all we can get out of this is
>> a cardinality() function, and nothing is done about the empty-array
>> semantics.

> Well, we can't change the zero-dim behavior without breaking backwards
> compatibility.  And enough people piled on to say NO to that, that it
> went by the wayside.

Meh.  Robert was pretty vocal about it, but it wasn't clear to me that
his was the majority opinion, and in any case there wasn't much
consideration given to compromises falling somewhere between "no
changes" and the rather drastic solution Brendan proposed.  For
instance, it's really hard to believe that this is a good thing:

regression=# select array_dims('{}'::int[]) is null;
 ?column?
----------
 t
(1 row)

Whatever you think the dimensions of that are, surely they're not
unknown.
        regards, tom lane



On Wed, Jun 12, 2013 at 1:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Josh Berkus <josh@agliodbs.com> writes:
>> On 06/12/2013 11:01 AM, Tom Lane wrote:
>>> I'm going to be disappointed if all we can get out of this is
>>> a cardinality() function, and nothing is done about the empty-array
>>> semantics.
>
>> Well, we can't change the zero-dim behavior without breaking backwards
>> compatibility.  And enough people piled on to say NO to that, that it
>> went by the wayside.
>
> Meh.  Robert was pretty vocal about it, but it wasn't clear to me that
> his was the majority opinion, and in any case there wasn't much
> consideration given to compromises falling somewhere between "no
> changes" and the rather drastic solution Brendan proposed.  For
> instance, it's really hard to believe that this is a good thing:
>
> regression=# select array_dims('{}'::int[]) is null;
>  ?column?
> ----------
>  t
> (1 row)
>
> Whatever you think the dimensions of that are, surely they're not
> unknown.

But, couldn't that be solved by deprecating that function and
providing a more sensible alternatively named version?

merlin



On 2013.06.12 10:11 AM, Josh Berkus wrote:
>>> Multidim arrays are why we can't have nice things.
>>
>> Yeah, I think that was not our best moment.  :-(
>
> Actually, if you think about it, we've never had "arrays" in PostgreSQL
> ... we have always had matrixes.  If you think about things that way,
> most of the current functionality makes sense.

If the type system were changed so that arrays were always just
one-dimensional, you could define your matrix simply as a binary
relation type whose primary key attribute has the type of a
fixed-length array of integers, where the number of elements in the
array is the number of dimensions in the matrix, and the array elements
themselves define the coordinates in the matrix.  This design confers a
number of benefits.  Also, the case of the zero-dimension matrix needs
no special treatment; the key array has zero elements.  Would that not
work? -- Darren Duncan
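The proposal above can be sketched quickly in Python (all names hypothetical; this is just a model of the idea, not a design for the catalog): a matrix becomes a mapping from a fixed-length coordinate tuple to a value, so the dimensionality is simply the key length, and zero dimensions needs no special case.

```python
# A 2-D matrix as a relation keyed by coordinate tuples:
matrix_2d = {
    (1, 1): 10, (1, 2): 20,
    (2, 1): 30, (2, 2): 40,
}

def ndims(matrix):
    """Number of dimensions = length of any key tuple."""
    return len(next(iter(matrix)))

assert ndims(matrix_2d) == 2

# The zero-dimension case falls out naturally: the sole key is the
# empty tuple.
zero_dim = {(): 42}
assert ndims(zero_dim) == 0
```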




On 13 June 2013 04:26, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Wed, Jun 12, 2013 at 1:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Josh Berkus <josh@agliodbs.com> writes:
>>> On 06/12/2013 11:01 AM, Tom Lane wrote:
>>>> I'm going to be disappointed if all we can get out of this is
>>>> a cardinality() function, and nothing is done about the empty-array
>>>> semantics.

I would be disappointed too, but on the other hand, CARDINALITY is
required by the spec and anything would be better than nothing.

>> Meh.  Robert was pretty vocal about it, but it wasn't clear to me that
>> his was the majority opinion, and in any case there wasn't much
>> consideration given to compromises falling somewhere between "no
>> changes" and the rather drastic solution Brendan proposed.

I'm all for looking into possible compromises, and will happily take
any improvements to this mess I think I can get past the compatibility
maximalist caucus.

>> regression=# select array_dims('{}'::int[]) is null;
>>  ?column?
>> ----------
>>  t
>> (1 row)
>>
>> Whatever you think the dimensions of that are, surely they're not
>> unknown.

I don't think anyone has actually tried to defend the behaviour of the
array functions w.r.t. empty arrays.  Even the opponents of the
original proposal agreed that the behaviour was silly, they just
didn't want to fix it, on account of the upgrade burden.

> But, couldn't that be solved by deprecating that function and
> providing a more sensible alternatively named version?

And what would you name that function?  array_dims2?  I can't think of
a name that makes the difference in behaviour apparent.  Can you
imagine the documentation for that?

array_dims - Returns the dimensions of the array, unless it is empty
in which case NULL.
array_proper_dims - Returns the dimensions of the array.
array_ndims - Returns the number of dimensions, unless it is empty in
which case NULL.
array_proper_ndims - Returns the number of dimensions.

... and so on for _length, _upper and _lower.
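To spell out the behaviour that makes those docs so awkward, here is a toy
model in Python of how the current functions treat an empty array (my own
sketch of the semantics, not the server code):

```python
# A Postgres array modelled as just its list of dimension lengths;
# '{}'::int[] is stored zero-D, i.e. dims == [].
def array_ndims(dims):
    return len(dims) or None  # current behaviour: zero dims reported as NULL

def array_dims(dims):
    # e.g. [2, 3] -> '[1:2][1:3]', matching array_dims()'s text form
    return "".join(f"[1:{n}]" for n in dims) if dims else None

print(array_ndims([]))      # None -- the astonishing case
print(array_dims([2, 3]))   # [1:2][1:3]
```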

Cheers,
BJ



On Wed, Jun 12, 2013 at 10:00 PM, Brendan Jurd <direvus@gmail.com> wrote:
> On 13 June 2013 04:26, Merlin Moncure <mmoncure@gmail.com> wrote:
>> On Wed, Jun 12, 2013 at 1:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Josh Berkus <josh@agliodbs.com> writes:
>>>> On 06/12/2013 11:01 AM, Tom Lane wrote:
>>>>> I'm going to be disappointed if all we can get out of this is
>>>>> a cardinality() function, and nothing is done about the empty-array
>>>>> semantics.
>
> I would be disappointed too, but on the other hand, CARDINALITY is
> required by the spec and anything would be better than nothing.

certainly.

>> But, couldn't that be solved by deprecating that function and
>> providing a more sensible alternatively named version?
>
> And what would you name that function?  array_dims2?  I can't think of
> a name that makes the difference in behaviour apparent.  Can you
> imagine the documentation for that?

I don't know the answer to that, but I think it's hard to argue that
deprecating and documenting a few functions is a heavier burden on
your users than having to sift through older arcane code before
upgrading to the latest version of the database.  We're not the only
ones stuck with lousy old functions (C finally ditched gets() in the
2011 standard).  I also happen to think the current array_api function
names are not particularly great (especially array_upper/array_lower)
so I won't shed too many tears.

merlin



On Jun 12, 2013, at 8:00 PM, Brendan Jurd <direvus@gmail.com> wrote:

> array_dims - Returns the dimensions of the array, unless it is empty
> in which case NULL.
> array_proper_dims - Returns the dimensions of the array.
> array_ndims - Returns the number of dimensions, unless it is empty in
> which case NULL.
> array_proper_ndims - Returns the number of dimensions.
>
> ... and so on for _length, _upper and _lower.

Similar things should have dissimilar names. I propose:

<bikeshedding>
     Old      |    New
--------------+--------------
 array_dims   | array_desc
 array_ndims  | array_depth
 array_length | array_size
 array_lower  | array_start
 array_upper  | array_finish

The last two are meh, but it’s a place to start…

</bikeshedding>

David




On 14 June 2013 03:53, David E. Wheeler <david@justatheory.com> wrote:
> Similar things should have dissimilar names. I propose:
>
> <bikeshedding>
>
>      Old      |    New
> --------------+--------------
>  array_dims   | array_desc

array_bounds?

>  array_ndims  | array_depth
>  array_length | array_size
>  array_lower  | array_start
>  array_upper  | array_finish
>
> The last two are meh, but it’s a place to start…

I think that even with the most dissimilar names we can come up with,
this is going to confuse people.  But it is still better than doing
nothing.

I wonder whether, if we go in this direction, we could still use some
of the work I did on deprecating zero-D arrays.  Let's say the old
functions keep doing what they do now, and we teach them to treat all
empty arrays the same way they currently treat zero-D arrays (return
NULL).  The new functions treat zero-D arrays as though they were 1-D
empty with default bounds, and we add CARDINALITY per ArrayGetNItems.

This way, applications would not be broken by upgrading, and we'd be
giving people a way to opt-in to a better API.
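As a rough Python sketch of the split I have in mind (dimension lists
standing in for arrays; the function names and the default-bounds rule are
only my reading of the proposal, not anything in the patch):

```python
from math import prod

def legacy_ndims(dims):
    # Old behaviour extended uniformly: any empty array reports NULL.
    return len(dims) if dims and prod(dims) > 0 else None

def proper_ndims(dims):
    # New behaviour: a zero-D array is treated as 1-D empty.
    return len(dims) if dims else 1

def cardinality(dims):
    # Total element count, as ArrayGetNItems computes it.
    return prod(dims) if dims else 0

print(legacy_ndims([0]))    # None: empty arrays keep returning NULL
print(proper_ndims([]))     # 1: zero-D promoted to 1-D empty
print(cardinality([3, 0]))  # 0
```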

Cheers,
BJ



On Thu, Jun 13, 2013 at 11:57:27AM -0500, Merlin Moncure wrote:
> >> But, couldn't that be solved by deprecating that function and
> >> providing a more sensible alternatively named version?
> >
> > And what would you name that function?  array_dims2?  I can't think of
> > a name that makes the difference in behaviour apparent.  Can you
> > imagine the documentation for that?
> 
> I don't know the answer to that, but I think it's hard to argue that
> deprecating and documenting a few functions is a heavier burden on
> your users than having to sift through older arcane code before
> upgrading to the latest version of the database.  We're not the only
> ones stuck with lousy old functions (C finally ditched gets() in the
> 2011 standard).  I also happen to think the current array_api function
> names are not particularly great (especially array_upper/array_lower)
> so I won't shed too many tears.

Sorry to be late on this, but are you saying people have code that is
testing:
        select array_dims('{}'::int[])

for a NULL return, and they would need to change that to test for zero?

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +



On Thu, Jun 20, 2013 at 2:58 PM, Bruce Momjian <bruce@momjian.us> wrote:
> On Thu, Jun 13, 2013 at 11:57:27AM -0500, Merlin Moncure wrote:
>> >> But, couldn't that be solved by deprecating that function and
>> >> providing a more sensible alternatively named version?
>> >
>> > And what would you name that function?  array_dims2?  I can't think of
>> > a name that makes the difference in behaviour apparent.  Can you
>> > imagine the documentation for that?
>>
>> I don't know the answer to that, but I think it's hard to argue that
>> deprecating and documenting a few functions is a heavier burden on
>> your users than having to sift through older arcane code before
>> upgrading to the latest version of the database.  We're not the only
>> ones stuck with lousy old functions (C finally ditched gets() in the
>> 2011 standard).  I also happen to think the current array_api function
>> names are not particularly great (especially array_upper/array_lower)
>> so I won't shed too many tears.
>
> Sorry to be late on this, but are you saying people have code that is
> testing:
>
>         select array_dims('{}'::int[])
>
> for a NULL return, and they would need to change that to test for zero?

Kinda -- what I'm saying is you just don't go around changing function
behaviors to make them 'better' unless the affected behavior was
specifically reserved as undefined.  The fact is nobody knows how many
users will be affected and the extent of the ultimate damage (pro tip:
it's always more and worse than expected); I'm astonished it's even
being considered.

merlin



On Thu, Jun 20, 2013 at 03:33:24PM -0500, Merlin Moncure wrote:
> On Thu, Jun 20, 2013 at 2:58 PM, Bruce Momjian <bruce@momjian.us> wrote:
> > On Thu, Jun 13, 2013 at 11:57:27AM -0500, Merlin Moncure wrote:
> >> >> But, couldn't that be solved by deprecating that function and
> >> >> providing a more sensible alternatively named version?
> >> >
> >> > And what would you name that function?  array_dims2?  I can't think of
> >> > a name that makes the difference in behaviour apparent.  Can you
> >> > imagine the documentation for that?
> >>
> >> I don't know the answer to that, but I think it's hard to argue that
> >> deprecating and documenting a few functions is a heavier burden on
> >> your users than having to sift through older arcane code before
> >> upgrading to the latest version of the database.  We're not the only
> >> ones stuck with lousy old functions (C finally ditched gets() in the
> >> 2011 standard).  I also happen to think the current array_api function
> >> names are not particularly great (especially array_upper/array_lower)
> >> so I won't shed too many tears.
> >
> > Sorry to be late on this, but are you saying people have code that is
> > testing:
> >
> >         select array_dims('{}'::int[])
> >
> > for a NULL return, and they would need to change that to test for zero?
> 
> Kinda -- what I'm saying is you just don't go around changing function
> behaviors to make them 'better' unless the affected behavior was
> specifically reserved as undefined.  The fact is nobody knows how many
> users will be affected and the extent of the ultimate damage (pro tip:
> it's always more and worse than expected); I'm astonished it's even
> being considered.

Well, I think the question is how many people have such arrays that will
be affected.  If we don't do something, we live with this odd behavior
forever.  We have been willing to make some bold decisions in the past
to improve user experience, and it mostly has worked out well.  I
disagree that it is always worse than expected.

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +



On Thu, Jun 20, 2013 at 6:40 PM, Bruce Momjian <bruce@momjian.us> wrote:
>> Kinda -- what I'm saying is you just don't go around changing function
>> behaviors to make them 'better' unless the affected behavior was
>> specifically reserved as undefined.  The fact is nobody knows how many
>> users will be affected and the extent of the ultimate damage (pro tip:
>> it's always more and worse than expected); I'm astonished it's even
>> being considered.
>
> Well, I think the question is how many people have such arrays that will
> be affected.  If we don't do something, we live with this odd behavior
> forever.  We have been willing to make some bold decisions in the past
> to improve user experience, and it mostly has worked out well.  I
> disagree that it is always worse than expected.

Well, you can have the last word (although 'bold' was an interesting
word choice, heh)  -- I feel guilty enough about beating up Brendan
already.  I feel this way every time compatibility changes come up, so
it's nothing specific to this patch really.

merlin



On Thu, Jun 20, 2013 at 07:13:48PM -0500, Merlin Moncure wrote:
> On Thu, Jun 20, 2013 at 6:40 PM, Bruce Momjian <bruce@momjian.us> wrote:
> >> Kinda -- what I'm saying is you just don't go around changing function
> >> behaviors to make them 'better' unless the affected behavior was
> >> specifically reserved as undefined.  The fact is nobody knows how many
> >> users will be affected and the extent of the ultimate damage (pro tip:
> >> it's always more and worse than expected); I'm astonished it's even
> >> being considered.
> >
> > Well, I think the question is how many people have such arrays that will
> > be effected.  If we don't do something, we live with this odd behavior
> > forever.  We have been willing to make some bold decisions in the past
> > to improve user experience, and it mostly has worked out well.  I
> > disagree that it is always worse than expected.
> 
> Well, you can have the last word (although 'bold' was an interesting
> word choice, heh)  -- I feel guilty enough about beating up Brendan
> already.  I feel this way every time compatibility changes come up, so
> it's nothing specific to this patch really.

Well, sometimes we underestimate the impact of changes, sometimes we
overestimate.  The big problem is weighing the short-term problems of
change but not the long-term benefit of a change.  This array problem
goes back to at least 2008:
http://www.postgresql.org/message-id/28026.1224611437@sss.pgh.pa.us

so we have at least five years of confusion by not changing it then.  I
am not saying we need to change it, but do think we need to weigh both
issues.

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +



Bruce,

> Well, sometimes we underestimate the impact of changes, sometimes we
> overestimate.  The big problem is weighing the short-term problems of
> change but not the long-term benefit of a change.  This array problem
> goes back to at least 2008:
> 
>     http://www.postgresql.org/message-id/28026.1224611437@sss.pgh.pa.us
> 
> so we have at least five years of confusion by not changing it then.  I
> am not saying we need to change it, but do think we need to weigh both
> issues.

As much as I hate the current behavior (my first response was "yeah, fix
those babies!"), I think we don't have a choice about creating new
function names and then waiting three years to deprecate the old ones.
We really can't afford to put obstacles in the way of people upgrading,
especially over an issue as minor as this one.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



On Thu, Jun 20, 2013 at 06:28:07PM -0700, Josh Berkus wrote:
> Bruce,
> 
> > Well, sometimes we underestimate the impact of changes, sometimes we
> > overestimate.  The big problem is weighing the short-term problems of
> > change but not the long-term benefit of a change.  This array problem
> > goes back to at least 2008:
> > 
> >     http://www.postgresql.org/message-id/28026.1224611437@sss.pgh.pa.us
> > 
> > so we have at least five years of confusion by not changing it then.  I
> > am not saying we need to change it, but do think we need to weigh both
> > issues.
> 
> As much as I hate the current behavior (my first response was "yeah, fix
> those babies!"), I think we don't have a choice about creating new
> function names and then waiting three years to deprecate the old ones.
> We really can't afford to put obstacles in the way of people upgrading,
> especially over an issue as minor as this one.

Perhaps we need to mark the TODO item as "will not fix".

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +