Re: Force re-compression with lz4 - Mailing list pgsql-general

From Florents Tselai
Subject Re: Force re-compression with lz4
Date
Msg-id D5E2909A-B652-432A-AAD0-2461858F0BAA@gmail.com
In response to Re: Force re-compression with lz4  (Ron <ronljohnsonjr@gmail.com>)
Responses Re: Force re-compression with lz4
List pgsql-general
Yes, that COPY-delete-COPY sequence is what I ended up doing.
Unfortunately I can’t use ranges, as the PK is a text string.
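For what it's worth, batching can still work with a text PK by using keyset pagination instead of numeric ranges; a rough sketch (assuming the table and columns from the snippets below, with `:last_id` supplied by the driving script, starting from the empty string):

```sql
-- Fetch one batch in PK order; the caller remembers the last id seen
-- and feeds it back in as :last_id for the next batch.
SELECT id, text
FROM t
WHERE id > :last_id   -- text comparison, works for any orderable PK
ORDER BY id
LIMIT 10000;
```

The same `WHERE id > :last_id AND id <= :batch_end` predicate can then bound the COPY/DELETE steps per batch.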

On 17 Oct 2021, at 7:36 PM, Ron <ronljohnsonjr@gmail.com> wrote:

On 10/17/21 10:12 AM, Florents Tselai wrote:
Hello,

I have a table storing mostly text data (40M+ rows) that has pg_total_relation_size ~670GB.
I’ve just upgraded to postgres 14 and I’m now eager to try the new LZ4 compression.

I’ve altered the column to use the new lz4 compression, but that only applies to new rows.
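The statement in question presumably looked like the following (assuming the table and column names from the snippets below); `pg_column_compression()` can confirm which method each stored value actually uses:

```sql
-- Switch the column's compression method for newly stored values only.
ALTER TABLE t ALTER COLUMN text SET COMPRESSION lz4;

-- Check what existing rows are actually compressed with.
SELECT pg_column_compression(text) AS method, count(*)
FROM t
GROUP BY 1;
```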

What’s the recommended way of triggering the re-evaluation for pre-existing rows? 

I tried wrapping a function like the following, but apparently each old record retains its original compression.

text_corpus := (SELECT t.text FROM ...);

DELETE FROM t WHERE id = ...;

INSERT INTO t(id, text) VALUES (id, text_corpus);
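The per-row cycle above can, as far as I can tell, be collapsed into a single UPDATE that forces each value to be detoasted and re-toasted with the column's current method; a sketch (assuming column `text` on table `t`; the no-op concatenation is there to produce a fresh datum rather than copying the old TOAST pointer unchanged):

```sql
-- Rewrite only values still carrying the old pglz compression.
UPDATE t
SET text = text || ''
WHERE pg_column_compression(text) = 'pglz';
```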

Because it's all in one transaction?

For the time being, I resorted to preparing an external shell script to execute against the DB, but that’s too slow as it moves data in and out of the database.

Is there a smarter way to do this?

Even with in-place compression, you've got to read the uncompressed data.

Does your shell script process one record at a time?  Maybe do ranges:
COPY (SELECT * FROM t WHERE id BETWEEN x AND y) TO '/some/file.csv' (FORMAT csv);
DELETE FROM t WHERE id BETWEEN x AND y;
COPY t FROM '/some/file.csv' (FORMAT csv);

--
Angular momentum makes the world go 'round.
