Thread: space taken by NULL values in array

space taken by NULL values in array

From
Leonardo Francalanci
Date:
Hi,


since postgresql multidimensional arrays can't have different size per axis, I
was wondering what would happen in case I used an array of, say, 10x10
elements, where only 10x2 elements are filled and the rest are NULL. I guess
the NULL elements take space (and I would have 80% of the space wasted),
but I'm not sure...

Can someone confirm?




Re: space taken by NULL values in array

From
John R Pierce
Date:
  On 09/20/10 12:58 AM, Leonardo Francalanci wrote:
> Hi,
>
>
> since postgresql multidimensional arrays can't have different size per axis,...

huh?  says what?

i thought PG multidimensional arrays were just arrays of arrays, and any
dimension could be anything.

in fact, the docs for 8.4 state that array declarations ignore specified
dimensions, the rows and columns simply are what they are.






Re: space taken by NULL values in array

From
Leonardo Francalanci
Date:
> i thought PG multidimensional  arrays were just arrays of arrays, and any
>dimension could be  anything.


from:

http://www.postgresql.org/docs/8.4/static/arrays.html

"Multidimensional arrays must have matching extents for each dimension. A
mismatch causes an error"





Re: space taken by NULL values in array

From
John R Pierce
Date:
  On 09/20/10 1:24 AM, Leonardo Francalanci wrote:
>> i thought PG multidimensional  arrays were just arrays of arrays, and any
>> dimension could be  anything.
>
> from:
>
> http://www.postgresql.org/docs/8.4/static/arrays.html
>
> "Multidimensional arrays must have matching extents for each dimension. A
> mismatch causes an error"
>

maybe thats poorly worded.

the example of that error shows an array with 2 columns in the first row
and 1 column in the 2nd row.

   '{{"meeting", "lunch"}, {"meeting"}}'


thats not a valid array.

    '{{"meeting", "lunch"}, {"training", "presentation"}}'

is a valid array.



Re: space taken by NULL values in array

From
Merlin Moncure
Date:
On Mon, Sep 20, 2010 at 4:12 AM, John R Pierce <pierce@hogranch.com> wrote:
>  On 09/20/10 12:58 AM, Leonardo Francalanci wrote:
>>
>> Hi,
>>
>>
>> since postgresql multidimensional arrays can't have different size per
>> axis,...
>
> huh?  says what?
>
> i thought PG multidimensional arrays were just arrays of arrays, and any
> dimension could be anything.
>
> in fact, the docs for 8.4 state that array declarations ignore specified
> dimensions, the rows and columns simply are what they are.


Yeah -- to get to the nitty gritty details -- postgresql arrays are
essentially a big stack of datums (technically, {length, datum} with
length doubling as the null signal)  and a vector of dimensions of
length zero to six.  The null elements take space, but only for the
length/null not the element itself, which can be skipped.

One possible optimization is to take advantage of cases where all the
elements are known not null, especially if the datums are fixed
length.  You could it down almost 50% in some cases.  If you have
nulls mixed in I don't see an easy way to improve over what we've
already got.  If you're fretting about it, maybe look at array
alternatives for storage.

If you truly want array of arrays, wrap your inner array in a composite type:
create type a as (v int[]);
create table b(v a[]);

insert into b values (array[row(array[1,2,3]), row(array[2,4])]::a[]);

postgres# select (unnest(v)).* from b;
    v
---------
 {1,2,3}
 {2,4}
(2 rows)

merlin