Thread: Arrays and Performance

Arrays and Performance

From
s_philip@ira.uka.de
Date:
A few performance issues using PostgreSQL's arrays led us to the
question of how Postgres actually stores variable-length arrays.
First, let me explain our situation.

We have a rather large table containing a simple integer primary key
and a couple more columns of fixed size. However, there is a dates
column of type "timestamp without time zone[]" that is apparently
causing some severe performance problems.

During a daily update process new timestamps are collected and
existing data rows are being updated (new rows are also being added).
These changes affect a large percentage of the existing rows.

What we have been observing over the last few weeks is that the
overall database size is increasing rapidly due to this table, and
vacuum processes seem to deadlock with other processes querying data
from this table.

Therefore, the database keeps growing and becomes more and more
unusable. The only thing that helps is dumping and restoring it, which
is nothing you are eager to do on a large live system on a daily basis.

This problem led us to the question of how these arrays are stored
internally. Are they stored "in place" with the other columns or
merely as a pointer to another file?
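
As an aside, the storage strategy for the column can be inspected in
the system catalogs; the table name below is a placeholder, since our
actual schema is not shown here:

```sql
-- Look up the storage strategy of the "dates" array column.
-- attstorage = 'x' ("extended") means large values are compressed
-- and, past roughly 2 kB, moved out-of-line into the table's TOAST
-- relation -- not kept in the main heap row, and not written to a
-- separate per-value file.
SELECT attname, attstorage
FROM pg_attribute
WHERE attrelid = 'mytable'::regclass   -- placeholder table name
  AND attname  = 'dates';
```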

Would it be more efficient not to use an array for this purpose but
to split the table into two parts?
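
For illustration, the split-table variant could look like the
following sketch (all names are made up, since the actual schema was
not posted):

```sql
-- One row per timestamp instead of one array per item. The daily
-- update then inserts small new rows instead of rewriting a large
-- array value (under MVCC, every UPDATE of the array column creates
-- a complete new version of the row, including the array).
CREATE TABLE item (
    id integer PRIMARY KEY
    -- ... the other fixed-size columns ...
);

CREATE TABLE item_dates (
    item_id integer NOT NULL REFERENCES item(id),
    ts      timestamp without time zone NOT NULL,
    PRIMARY KEY (item_id, ts)
);

-- Reassemble the timestamps for one item:
SELECT ts FROM item_dates WHERE item_id = 42 ORDER BY ts;
```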

Any help is appreciated!


Marc Philipp

----------------------------------------------------------------
This message was sent using ATIS-Webmail: http://www.atis.uka.de

Re: Arrays and Performance

From
Joe Conway
Date:
s_philip@ira.uka.de wrote:
> Would it be more efficient to not use an array for this purpose but
> split the table in two parts?
>
> Any help is appreciated!

This is a duplicate of your post from the other day, to which I
responded, as did Tom Lane:

http://archives.postgresql.org/pgsql-general/2006-01/msg00104.php
http://archives.postgresql.org/pgsql-general/2006-01/msg00108.php

Did you not receive those replies?

Joe

Re: Arrays and Performance

From
"Jim C. Nasby"
Date:
On Fri, Jan 06, 2006 at 09:43:53AM +0100, s_philip@ira.uka.de wrote:
> What we have been observing in the last few weeks is, that the
> overall database size is increasing rapidly due to this table and
> vacuum processes seem to deadlock with other processes querying data
> from this table.

Are you seeing deadlock errors? How often are you vacuuming?
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: Arrays and Performance

From
Marc Philipp
Date:
Sorry for the duplicate post! My first post was stalled and my mail
server was down for a day or so. I will reply to your original posts.

Regards, Marc Philipp


Re: Arrays and Performance

From
Marc Philipp
Date:
No, we don't get deadlock errors, but when a vacuum runs while another
process is writing to the database, both stall at some point and
nothing happens until one of the processes is killed.

I think we used to vacuum every two nights and did a full vacuum once a
week.
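
Note that a plain VACUUM and a VACUUM FULL behave very differently
with respect to locking; a sketch, with a placeholder table name:

```sql
-- A plain (lazy) VACUUM takes only a light lock and does not block
-- readers or writers, so it can safely run much more often, e.g.
-- several times a day on this one heavily updated table:
VACUUM ANALYZE mytable;

-- VACUUM FULL takes an exclusive lock on the table while it rewrites
-- it; concurrent queries block until it finishes, which can look like
-- the hang described above:
-- VACUUM FULL mytable;
```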

Regards, Marc Philipp