Re: Inconsistent behavior on Array & Is Null? - Mailing list pgsql-hackers

From Joe Conway
Subject Re: Inconsistent behavior on Array & Is Null?
Date
Msg-id 406F1882.2040904@joeconway.com
In response to Re: Inconsistent behavior on Array & Is Null?  (Greg Stark <gsstark@mit.edu>)
Responses Re: Inconsistent behavior on Array & Is Null?
List pgsql-hackers
Greg Stark wrote:
> Joe Conway <mail@joeconway.com> writes:
>>I agree. I had always envisioned something exactly like that once we supported
>>NULL elements. As far as the implementation goes, I think it would be very
>>similar to tuples -- a null bitmask that would exist if any elements are NULL.
> 
> Well you might still want to store an internal "all indexes below this are
> null". That way update foo set a[1000]=1 doesn't require storing even a bitmap
> for the first 999 elements. Though might make maintaining the bitmap kind of a
> pain. Maintaining the bitmap might be kind of a pain anyways though because
> unlike tuples the array size isn't constant.

I don't think it will be worth the complication to do other than a 
straight bitmap -- at least not the first attempt.
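[The tuple-style representation being discussed could be sketched like this -- a Python illustration of the layout only, not PostgreSQL internals; the class name and 0-based indexing are mine:]

```python
class NullableArray:
    """Illustrative sketch: element values plus a null bitmap, as in
    heap tuples. The bitmap is allocated only when at least one
    element is NULL (None here)."""

    def __init__(self, values):
        self.length = len(values)
        self.values = values
        if any(v is None for v in values):
            # one bit per element; a set bit means "this element is NULL"
            self.bitmap = bytearray((self.length + 7) // 8)
            for i, v in enumerate(values):
                if v is None:
                    self.bitmap[i // 8] |= 1 << (i % 8)
        else:
            self.bitmap = None  # no NULLs: skip the bitmap entirely

    def is_null(self, i):
        # 0-based index, purely for the sketch
        if self.bitmap is None:
            return False
        return bool(self.bitmap[i // 8] & (1 << (i % 8)))
```

[The point of the straight bitmap is exactly this simplicity: no "all indexes below this are null" special case to maintain on update.]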

>>A related question is how to deal with non-existing array elements. Until now,
>>you could do:
> 
> I would have to think about it some more, but my first reaction is that
> looking up [0] should generate an error if there can never be a valid entry at
> [0]. But looking up indexes above the highest index should return NULL.
> 
> There are two broad use cases I see for arrays. Using them to represent tuples
> where a[i] means something specific for each i, and using them to represent
> sets where order doesn't matter.
> 
> In the former case I might want to initialize my column to an empty array and
> set only the relevant columns as needed. In that case returning NULL for
> entries that haven't been set yet whether they're above the last entry set or
> below is most consistent.

Maybe, but you're still going to need to explicitly set the real upper 
bound element in order for the length/cardinality to be correct. In 
other words, if you really want an array with elements 1 to 1000, but 2 
through 1000 are NULL, you'll need to explicitly set a[1000] = NULL; 
otherwise we'll have no way of knowing that you really want 1000 
elements. Perhaps we'll want some kind of array_init function to create 
an array of a given size filled with all NULL elements (or even some 
arbitrary constant element).
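[array_init is only a suggestion at this point; its intended semantics could be sketched in Python as:]

```python
def array_init(n, fill=None):
    """Hypothetical array_init(size, fill): build an n-element array
    whose elements are all `fill` (NULL/None by default), so the
    cardinality is well defined without assigning the last element
    by hand."""
    return [fill] * n
```

[So instead of setting a[1000] = NULL just to fix the length, you'd write something like array_init(1000) and get 1000 NULL elements directly.]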

Given the preceding, I'd think it would make more sense to throw an 
error whenever trying to access an element greater than the length.
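[In other words, with 1-based subscripts the proposed behavior would look like this sketch (the function name is mine, for illustration):]

```python
def array_ref(arr, i):
    """Sketch of the error-on-out-of-range semantics proposed above:
    1-based subscripting, and any index outside 1..length raises an
    error rather than silently returning NULL."""
    if not 1 <= i <= len(arr):
        raise IndexError(
            f"array subscript {i} out of range 1..{len(arr)}")
    return arr[i - 1]
```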

> In the latter case you really don't want to be looking up anything past the
> end and don't want to be storing NULLs at all. So it doesn't really matter
> what the behaviour is for referencing elements past the end, but you might
> conceivably want to write code like "while (e = a[i++]) ...".

See the reasoning above. And if you did somehow wind up with a "real" 
NULL element in this scenario, you'd never know about it. The looping 
could always be:  while (i++ <= length)
or  for (i = 1; i <= length; i++)

Joe


