
From Andres Freund
Subject Re: [REVIEW] Re: Compression of full-page-writes
Date
Msg-id 20141212150429.GL31413@awork2.anarazel.de
In response to Re: [REVIEW] Re: Compression of full-page-writes  (Michael Paquier <michael.paquier@gmail.com>)
Responses Re: [REVIEW] Re: Compression of full-page-writes  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 2014-12-12 23:50:43 +0900, Michael Paquier wrote:
> I got curious to see how compression of an entire record would perform
> and how it compares for small WAL records, so here are some numbers based
> on the attached patch. This patch compresses the whole record including the
> block headers, leaving only XLogRecord out of it, with a flag indicating
> that the record is compressed (note that the replay portion of this patch
> is untested; still, it gives an idea of how much compression of the whole
> record affects user CPU in this test case). It uses a buffer of 4 * BLCKSZ;
> if the record is longer than that, compression is simply given up.
> These tests use the hack upthread that calculates user and system CPU
> within a backend using getrusage().
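A minimal sketch of the size cutoff described above, in plain C with illustrative names (the scratch-buffer size mirrors the 4 * BLCKSZ from the mail; toy_compress and try_compress_record are stand-ins, not the actual patch or the pglz interface):

#include <stdbool.h>
#include <stdint.h>

#define BLCKSZ 8192                       /* PostgreSQL's default block size */
#define COMPRESS_BUFSIZE (4 * BLCKSZ)     /* scratch buffer, as in the mail */

static char scratch[COMPRESS_BUFSIZE];

/* Placeholder compressor: a real implementation would be pglz or similar.
 * Returns the compressed length, or -1 when nothing was saved. */
static int32_t
toy_compress(const char *src, uint32_t srclen, char *dst, uint32_t dstlen)
{
    (void) src;
    (void) srclen;
    (void) dst;
    (void) dstlen;
    return -1;                            /* pretend compression never helps */
}

/* Try to compress a whole WAL record into the scratch buffer.  Returns false
 * (caller writes the record uncompressed) when the record does not fit in
 * 4 * BLCKSZ or when compression does not shrink it. */
static bool
try_compress_record(const char *rec, uint32_t reclen,
                    const char **out, uint32_t *outlen)
{
    int32_t clen;

    if (reclen > COMPRESS_BUFSIZE)
        return false;                     /* too large: give up on compression */

    clen = toy_compress(rec, reclen, scratch, sizeof(scratch));
    if (clen < 0 || (uint32_t) clen >= reclen)
        return false;                     /* no gain: keep the raw record */

    *out = scratch;
    *outlen = (uint32_t) clen;
    return true;                          /* caller sets the "compressed" flag */
}

The only interesting part is the early return for records above 4 * BLCKSZ; the rest is scaffolding around whichever compressor is actually used.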
> 
> Here is the simple test case I used, with 512MB of shared_buffers and small
> records: fill up a bunch of buffers, dirty them, and then compress FPWs
> with a checkpoint.
> #!/bin/bash
> psql <<EOF
> SELECT pg_backend_pid();
> CREATE TABLE aa (a int);
> CREATE TABLE results (phase text, position pg_lsn);
> CREATE EXTENSION IF NOT EXISTS pg_prewarm;
> ALTER TABLE aa SET (FILLFACTOR = 50);
> INSERT INTO results VALUES ('pre-insert', pg_current_xlog_location());
> INSERT INTO aa VALUES (generate_series(1,7000000)); -- 484MB
> SELECT pg_size_pretty(pg_relation_size('aa'::regclass));
> SELECT pg_prewarm('aa'::regclass);
> CHECKPOINT;
> INSERT INTO results VALUES ('pre-update', pg_current_xlog_location());
> UPDATE aa SET a = 7000000 + a;
> CHECKPOINT;
> INSERT INTO results VALUES ('post-update', pg_current_xlog_location());
> SELECT * FROM results;
> EOF
> 
> Note that autovacuum and fsync are off.
> =# select phase, user_diff, system_diff,
> pg_size_pretty(pre_update - pre_insert),
> pg_size_pretty(post_update - pre_update) from results;
>        phase        | user_diff | system_diff | pg_size_pretty | pg_size_pretty
> --------------------+-----------+-------------+----------------+----------------
>  Compression FPW    | 42.990799 |    0.868179 | 429 MB         | 567 MB
>  No compression     | 25.688731 |    1.236551 | 429 MB         | 727 MB
>  Compression record | 56.376750 |    0.769603 | 429 MB         | 566 MB
> (3 rows)
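The user_diff and system_diff columns come from the getrusage() hack mentioned upthread. As a rough illustration only (not the actual hack), measuring user and system CPU around a piece of work with getrusage() looks like this:

#include <stdio.h>
#include <sys/resource.h>
#include <sys/time.h>

/* Seconds represented by a struct timeval difference. */
static double
tv_diff(struct timeval end, struct timeval start)
{
    return (end.tv_sec - start.tv_sec) +
           (end.tv_usec - start.tv_usec) / 1000000.0;
}

int
main(void)
{
    struct rusage before, after;

    getrusage(RUSAGE_SELF, &before);

    /* ... the work being measured, e.g. the statements generating FPWs ... */

    getrusage(RUSAGE_SELF, &after);

    /* These differences are what the user_diff/system_diff columns report. */
    printf("user_diff = %f, system_diff = %f\n",
           tv_diff(after.ru_utime, before.ru_utime),
           tv_diff(after.ru_stime, before.ru_stime));
    return 0;
}

In the thread's hack the snapshots are presumably taken inside the backend itself rather than in a separate program, so RUSAGE_SELF covers the backend running the test statements.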
> If we do record-level compression, we'll need to be very careful in
> defining a lower bound so we don't eat unnecessary CPU resources, perhaps
> something that should be controlled with a GUC. I presume that this holds
> true for the upper bound as well.

Record-level compression pretty obviously would need a lower boundary
for when to use compression. It won't be useful for small heapam/btree
records, but it'll be rather useful for large multi_insert, clean, or
similar records...
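
To make that concrete, such a lower boundary would presumably end up as a simple length check driven by a GUC. A hedged sketch, with a hypothetical parameter name (wal_compress_min_size is not an actual GUC, just an illustration):

#include <stdbool.h>
#include <stdint.h>

/* Hypothetical GUC; the name and default are assumptions for illustration. */
static int wal_compress_min_size = 128;   /* bytes; 0 disables the lower bound */

/* Skip compression for records below the threshold, where the CPU spent is
 * unlikely to pay off (small heapam/btree records), and attempt it for the
 * larger multi_insert, clean, and similar records. */
static bool
should_compress_record(uint32_t reclen)
{
    return wal_compress_min_size == 0 ||
           reclen >= (uint32_t) wal_compress_min_size;
}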

Greetings,

Andres Freund


