Re: Trading off large objects (arrays, large strings, large tables) for timeseries - Mailing list pgsql-general

From Tom Lane
Subject Re: Trading off large objects (arrays, large strings, large tables) for timeseries
Date
Msg-id 24072.1108567036@sss.pgh.pa.us
In response to Re: Trading off large objects (arrays, large strings, large tables) for timeseries  (Antonios Christofides <anthony@itia.ntua.gr>)
List pgsql-general
Antonios Christofides <anthony@itia.ntua.gr> writes:
> CREATE TABLE test(id integer not null primary key, records text[]);

> UPDATE test SET records[2000007] = 'hello, world!';
> [11 seconds]
> UPDATE test SET records[1000000] = 'hello, world!';
> [15 seconds (but the difference may be because of system load - I
> don't have a completely idle machine available right now)]

> I thought the two above UPDATE commands would be instant.

Hardly likely, seeing that text[] has variable-length array entries;
the only way to isolate and replace the 2000007th entry is to scan
through all the ones before it.  However, the fact that the two cases
seem to be about the same speed suggests to me that the bulk of the time
is going into loading/decompressing/compressing/storing the array datum.

You might try experimenting with the column storage option (see ALTER
TABLE) --- using EXTERNAL instead of the default EXTENDED would suppress
the compression/decompression step.  I suspect that will be a net loss,
since the CPU overhead it eliminates gets traded for more I/O
... but it would be worth doing the experiment to find out.
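
If it helps, a minimal sketch of that experiment, reusing the table and
column from your example above (timings will of course be machine-dependent):

    -- EXTERNAL keeps large values out-of-line in the TOAST table but
    -- uncompressed, so each read/write of the datum skips the
    -- compress/decompress pass.  Note this only affects values stored
    -- after the change; existing rows keep their compressed datums
    -- until they are rewritten.
    ALTER TABLE test ALTER COLUMN records SET STORAGE EXTERNAL;

    -- Re-run the same update and compare the timing:
    UPDATE test SET records[2000007] = 'hello, world!';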

On the whole though, I think you're going to have to abandon this
approach.  The TOAST mechanism really isn't designed to support partial
updates of huge fields efficiently.  It forces any change in the value
to be an update of the whole value.
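
For comparison, a rough sketch of the row-per-entry layout that sidesteps
the problem entirely (column names here are only illustrative, not anything
from your schema):

    CREATE TABLE test_records (
        id     integer not null,
        recno  integer not null,
        record text,
        PRIMARY KEY (id, recno)
    );

    -- Updating one entry now touches a single short row instead of
    -- rewriting a multi-megabyte array datum.
    UPDATE test_records SET record = 'hello, world!'
        WHERE id = 1 AND recno = 2000007;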

            regards, tom lane
