Thread: Array access performance

Array access performance

From
Andreas Brandl
Date:
Hi,

I'm looking for a hint how array access performs in PostgreSQL in respect to performance. Normally I would expect
accessof a 1-dimensional Array at slot i (array[i]) to perform in constant time (random access). 

Is this also true for postgres' arrays?

May concrete example is a 1-dimensional array d of length <= 600 (which will grow at a rate of 1 entry/day) stored in a
table'scolumn. I need to access this array two times per tuple, i.e. d[a], d[b]. Therefore I hope access is not linear.
Isthis correct? 

Also I'm having some performance issues building this array. I'm doing this with a used-defined aggregate function,
startingwith an empty array and using array_append and some calculation for each new entry. I assume this involves some
copying/memoryallocation on each call, but I could not find the implementation of array_append in postgres-source/git.  

Is there an efficient way to append to an array? I could also start with a pre-initialized array of the required
length,but this involves some complexity. 

Thank you

Regards,
Andreas

Re: Array access performance

From
Andreas Brandl
Date:
> Is this also true for postgres' arrays?

Sorry, I'm using latest postgres 9.0.4 on debian squeeze/amd64.

Greetings
Andreas

Re: Array access performance

From
Tom Lane
Date:
Andreas Brandl <ml@3.141592654.de> writes:
> I'm looking for a hint how array access performs in PostgreSQL in respect to performance. Normally I would expect
accessof a 1-dimensional Array at slot i (array[i]) to perform in constant time (random access). 

> Is this also true for postgres' arrays?

Only if the element type is fixed-length (no strings for instance) and
the array does not contain, and never has contained, any nulls.
Otherwise a scan through all the previous elements is required to find
a particular element.

By and large, if you're thinking of using arrays large enough to make
this an interesting question, I would say stop right there and redesign
your database schema.  You're not thinking relationally, and it's gonna
cost ya.

            regards, tom lane

Re: Array access performance

From
Andreas Brandl
Date:
Hi Tom,

> > I'm looking for a hint how array access performs in PostgreSQL in
> > respect to performance. Normally I would expect access of a
> > 1-dimensional Array at slot i (array[i]) to perform in constant time
> > (random access).
>
> > Is this also true for postgres' arrays?
>
> Only if the element type is fixed-length (no strings for instance) and
> the array does not contain, and never has contained, any nulls.
> Otherwise a scan through all the previous elements is required to find
> a particular element.

We're using bigint elements here and don't have nulls, so this should be fine.

> By and large, if you're thinking of using arrays large enough to make
> this an interesting question, I would say stop right there and
> redesign
> your database schema. You're not thinking relationally, and it's gonna
> cost ya.

In general, I agree. We're having a nice relational database but are facing some perfomance issues. My approach is to
builda materialized view which exploits the array feature and heavily relies on constant time access on arrays. 

Thank you!

Regards,
Andreas