PostgreSQL Arrays and Performance - Mailing list pgsql-general

From Marc Philipp
Subject PostgreSQL Arrays and Performance
Msg-id 927255D6-5FD9-4D06-93B6-C3FE25395C61@marcphilipp.de
Responses Re: PostgreSQL Arrays and Performance  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: PostgreSQL Arrays and Performance  (Joe Conway <mail@joeconway.com>)
List pgsql-general
A few performance issues with PostgreSQL's arrays have led us to ask
how PostgreSQL actually stores variable-length arrays. First, let me
explain our situation.

We have a rather large table containing a simple integer primary key
and a couple more columns of fixed size. However, there is a dates
column of type "timestamp without time zone[]" that is apparently
causing some severe performance problems.

During a daily update process new timestamps are collected and
existing data rows are being updated (new rows are also being added).
These changes affect a large percentage of the existing rows.

What we have been observing over the last few weeks is that the
overall database size is increasing rapidly because of this table, and
vacuum processes seem to deadlock with other processes querying data
from this table.

As a result, the database keeps growing and becomes more and more
unusable. The only thing that helps is dumping and restoring it, which
is not something you want to do daily on a large live system.

This problem led us to the question of how these arrays are stored
internally. Are they stored "in-place" with the other columns, or
merely as a pointer to another file?

Would it be more efficient to drop the array for this purpose and
split the table into two parts instead?
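
To make the second option concrete, here is a minimal sketch of what
such a split might look like. It is only an illustration: the table
and column names (items, item_dates, label, ts) are hypothetical, and
it uses Python's built-in sqlite3 module as a stand-in so that it runs
without a server. In PostgreSQL the ts column would presumably be
"timestamp without time zone" rather than text.

```python
import sqlite3


def build_split_schema(conn):
    # Hypothetical layout: the fixed-size columns stay in "items",
    # while each timestamp becomes its own row in "item_dates".
    conn.executescript("""
        CREATE TABLE items (
            id    INTEGER PRIMARY KEY,
            label TEXT NOT NULL
        );
        CREATE TABLE item_dates (
            item_id INTEGER NOT NULL REFERENCES items(id),
            ts      TEXT NOT NULL
        );
        CREATE INDEX item_dates_item_id_idx ON item_dates (item_id);
    """)


def add_timestamps(conn, item_id, timestamps):
    # The daily update inserts new rows; the existing row in "items"
    # is not rewritten when timestamps are added.
    conn.executemany(
        "INSERT INTO item_dates (item_id, ts) VALUES (?, ?)",
        [(item_id, ts) for ts in timestamps],
    )


def dates_for(conn, item_id):
    # Reassemble the list that used to live in the array column.
    rows = conn.execute(
        "SELECT ts FROM item_dates WHERE item_id = ? ORDER BY ts",
        (item_id,),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
build_split_schema(conn)
conn.execute("INSERT INTO items (id, label) VALUES (1, 'example')")
add_timestamps(conn, 1, ["2006-01-01 00:00:00"])  # first daily run
add_timestamps(conn, 1, ["2006-01-02 00:00:00"])  # second daily run
print(dates_for(conn, 1))
```

With this layout, adding a timestamp is an INSERT into the second
table rather than an UPDATE of the array column in the main table.
Whether that is actually faster in our case is exactly what we are
asking.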

Any help is appreciated!


Marc Philipp
