Re: [REVIEW] Re: Compression of full-page-writes - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: [REVIEW] Re: Compression of full-page-writes
Date
Msg-id CAB7nPqRC20=mKgu6d2st-e11_QqqbreZg-=SF+_UYsmvwNu42g@mail.gmail.com
Whole thread Raw
In response to Re: [REVIEW] Re: Compression of full-page-writes  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: [REVIEW] Re: Compression of full-page-writes  (Rahila Syed <rahilasyed90@gmail.com>)
List pgsql-hackers

On Sat, Dec 6, 2014 at 12:17 AM, Andres Freund <andres@2ndquadrant.com> wrote:
On 2014-12-06 00:10:11 +0900, Michael Paquier wrote:
> On Sat, Dec 6, 2014 at 12:06 AM, Michael Paquier <michael.paquier@gmail.com>
> wrote:
>
> >
> >
> >
> > On Fri, Dec 5, 2014 at 11:10 PM, Rahila Syed <rahilasyed.90@gmail.com>
> > wrote:
> >
> >> I attempted quick review and could not come up with much except this
> >>
> >> +   /*
> >> +    * Calculate the amount of FPI data in the record. Each backup block
> >> +    * takes up BLCKSZ bytes, minus the "hole" length.
> >> +    *
> >> +    * XXX: We peek into xlogreader's private decoded backup blocks for
> >> the
> >> +    * hole_length. It doesn't seem worth it to add an accessor macro for
> >> +    * this.
> >> +    */
> >> +   fpi_len = 0;
> >> +   for (block_id = 0; block_id <= record->max_block_id; block_id++)
> >> +   {
> >> +       if (XLogRecHasCompressedBlockImage(record, block_id))
> >> +           fpi_len += BLCKSZ - record->blocks[block_id].compress_len;
> >>
> >>
> >> IIUC, fpi_len in case of compressed block image should be
> >>
> >> fpi_len = record->blocks[block_id].compress_len;
> >>
> > Yep, true. Patches need a rebase btw as Heikki fixed a commit related to
> > the stats of pg_xlogdump.
> >
>
> In any case, any opinions to switch this patch as "Ready for committer"?

Needing a rebase is a obvious conflict to that... But I guess some wider
looks afterwards won't hurt.
 
Here are rebased versions, which are patches 1 and 2. And I am switching as well the patch to "Ready for Committer". The important point to consider for this patch is the use of the additional 2-bytes as uint16 in the block information structure to save the length of a compressed block, which may be compressed without its hole to achieve a double level of compression (image compressed without its hole). We may use a simple flag on one or two bits using for example a bit from hole_length, but in this case we would need to always compress images with their hole included, something more expensive as the compression would take more time.

Robert wrote:
> What I would suggest is instrument the backend with getrusage() at
> startup and shutdown and have it print the difference in user time and
> system time.  Then, run tests for a fixed number of transactions and
> see how the total CPU usage for the run differs.
That's a nice idea, which is done with patch 3 as a simple hack calling twice getrusage at the beginning of PostgresMain and before proc_exit, calculating the difference time and logging it for each process (used as well log_line_prefix with %p).

Then I just did a small test with a load of a pgbench-scale-100 database on fresh instances:
1) Compression = on:
Stop LSN: 0/487E49B8
getrusage: proc 11163: LOG:  user diff: 63.071127, system diff: 10.898386
pg_xlogdump: FPI size: 122296653 [90.52%]
2) Compression = off
Stop LSN: 0/4E54EB88
Result: proc 11648: LOG:  user diff: 43.855212, system diff: 7.857965
pg_xlogdump: FPI size: 204359192 [94.10%]
And the CPU consumption is showing quite some difference... I'd expect as well pglz_compress to show up high in a perf profile for this case (don't have the time to do that now, but a perf record -a -g would be fine I guess).
Regards,
--
Michael
Attachment

pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: Parallel Seq Scan
Next
From: Peter Eisentraut
Date:
Subject: Re: superuser() shortcuts