Re: [PATCHES] Full page writes improvement, code update - Mailing list pgsql-hackers

From Koichi Suzuki
Subject Re: [PATCHES] Full page writes improvement, code update
Date
Msg-id 460C611E.6090800@oss.ntt.co.jp
In response to Re: [PATCHES] Full page writes improvement, code update  ("Simon Riggs" <simon@2ndquadrant.com>)
Responses Re: [PATCHES] Full page writes improvement, code update
Re: [PATCHES] Full page writes improvement, code update
List pgsql-hackers
Josh;

I'd like to explain again what the term "compression" means in my
proposal, and to show a resource consumption comparison with cp and
gzip.

My proposal is to remove unnecessary full page writes (which are needed
for crash recovery from inconsistent or partial writes) when we copy WAL
to the archive log, and to rebuild them as dummies when we restore from
the archive log.  The dummies are needed to maintain the LSN.  So it is
very different from general purpose compression such as gzip, although
pg_compresslog does compress the archive log as a result.
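
To make the idea concrete, here is a toy sketch in Python.  It is not the
pg_compresslog code and does not use the real WAL record format: the
Record class, the crash_only mark and the archive/restore functions are
all hypothetical names made up for illustration.  It only shows why
padding with a dummy of the removed length keeps every later record at
its original LSN.

```python
from dataclasses import dataclass


@dataclass
class Record:
    lsn: int            # byte offset where the record starts in the log
    body: bytes         # the logical part of the record
    page_image: bytes   # full page image, b"" if the record has none
    crash_only: bool    # hypothetical mark: image needed only for crash recovery


def archive(records):
    """Drop removable full page images, remembering how long each one was."""
    archived = []
    for r in records:
        if r.crash_only and r.page_image:
            archived.append((r.lsn, r.body, len(r.page_image)))
        else:
            archived.append((r.lsn, r.body + r.page_image, 0))
    return archived


def restore(archived):
    """Rebuild the log, padding with dummy bytes so every LSN is unchanged."""
    log = bytearray()
    for lsn, payload, dropped in archived:
        if len(log) != lsn:
            raise ValueError("record would not land on its original LSN")
        log += payload
        log += b"\x00" * dropped   # dummy standing in for the removed image
    return bytes(log)


def archived_size(archived):
    """Number of payload bytes actually kept in the archive."""
    return sum(len(payload) for _, payload, _ in archived)


records = [
    Record(0, b"a" * 40, b"P" * 8192, True),       # image can be removed
    Record(8232, b"b" * 60, b"", False),           # no image at all
    Record(8292, b"c" * 40, b"Q" * 8192, False),   # image must be kept
]
arch = archive(records)
restored = restore(arch)
```

In this toy run the archive keeps the second and third records intact,
drops one page image, and the restored log has the same length as the
original, so the record positions are preserved.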

As to CPU and I/O consumption, I've already evaluated them as follows:

1) Collect all the WAL segments.
2) Copy them by different means: cp, pg_compresslog and gzip.

I then compared the elapsed time as well as other resource consumption.

Benchmark: DBT-2
Database size: 120WH (12.3GB)
Total WAL size: 4.2GB (after 60min. run)
Elapsed time:
   cp:            120.6sec
   gzip:          590.0sec
   pg_compresslog: 79.4sec
Resultant archive log size:
   cp:             4.2GB
   gzip:           2.2GB
   pg_compresslog: 0.3GB
Resource consumption:
   cp:   user:   0.5sec system: 15.8sec idle:  16.9sec I/O wait: 87.7sec
   gzip: user: 286.2sec system:  8.6sec idle: 260.5sec I/O wait: 36.0sec
   pg_compresslog:
         user:   7.9sec system:  5.5sec idle:  37.8sec I/O wait: 28.4sec

Because the resultant log is considerably smaller than with cp or gzip,
pg_compresslog needs much less I/O, and because its logic is much
simpler than gzip's, it consumes very little CPU.
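
The ratios behind this comparison can be worked out from the figures
above.  The short Python sketch below only does that arithmetic; the
numbers are copied from the benchmark results in this mail, while the
dictionary layout and the summarize function are just illustrative names.

```python
# Figures copied from the benchmark results in this mail.
results = {
    "cp":             {"elapsed": 120.6, "size_gb": 4.2, "user": 0.5,   "system": 15.8},
    "gzip":           {"elapsed": 590.0, "size_gb": 2.2, "user": 286.2, "system": 8.6},
    "pg_compresslog": {"elapsed": 79.4,  "size_gb": 0.3, "user": 7.9,   "system": 5.5},
}


def summarize(results, baseline="cp"):
    """Express each method relative to the baseline and total its CPU time."""
    base = results[baseline]
    summary = {}
    for name, r in results.items():
        summary[name] = {
            "size_ratio": r["size_gb"] / base["size_gb"],
            "elapsed_ratio": r["elapsed"] / base["elapsed"],
            "cpu_sec": r["user"] + r["system"],
        }
    return summary


summary = summarize(results)
for name, s in summary.items():
    print(f"{name:15s} size x{s['size_ratio']:.2f}  "
          f"elapsed x{s['elapsed_ratio']:.2f}  cpu {s['cpu_sec']:.1f}sec")
```

With these inputs pg_compresslog's archive is roughly 7% of the size of
the plain copy, and its user plus system time is lower than that of cp.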

The term "compress" may not be appropriate.   We may call this "log
optimization" instead.

So I don't see any reason why this (at least the optimization "mark" in
each log record) can't be integrated.

Simon Riggs wrote:
> On Thu, 2007-03-29 at 11:45 -0700, Josh Berkus wrote:
>
>>> OK, different question:
>>> Why would anyone ever set full_page_compress = off?
>> The only reason I can see is if compression costs us CPU but gains RAM &
>> I/O.  I can think of a lot of applications ... benchmarks included ...
>> which are CPU-bound but not RAM or I/O bound.  For those applications,
>> compression is a bad tradeoff.
>>
>> If, however, CPU used for compression is made up elsewhere through smaller
>> file processing, then I'd agree that we don't need a switch.

As I wrote in reply to Simon's comment, I have only one concern.

Without a switch, because both the full page writes and the
corresponding logical log are included in WAL, the WAL size will
increase slightly (maybe about five percent or so).  If everybody is
happy with this, we don't need a switch.
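
As a rough illustration of the scale only: applying the "about five
percent" guess to the 4.2GB of WAL from the benchmark above gives the
figures computed below.  The five percent is an estimate, not a
measurement, and the function name is made up for this sketch.

```python
def wal_with_overhead(wal_gb, overhead_fraction):
    """Return (extra GB, total GB) for a given fractional WAL size increase."""
    extra = wal_gb * overhead_fraction
    return extra, wal_gb + extra


# 4.2GB is the total WAL size of the 60 minute DBT-2 run;
# 0.05 is the rough "five percent or so" guess, not a measured value.
extra_gb, total_gb = wal_with_overhead(4.2, 0.05)
print(f"extra: {extra_gb:.2f}GB, total: {total_gb:.2f}GB")
```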

>
> Koichi-san has explained things for me now.
>
> I misunderstood what the parameter did and reading your post, ISTM you
> have as well. I do hope Koichi-san will alter the name to allow
> everybody to understand what it does.
>

Here're some candidates:
full_page_writes_optimize
full_page_writes_mark: means it marks full_page_write as "needed in
crash recovery", "needed in archive recovery" and so on.

I don't insist on these names.  It would be very helpful if you could
suggest a name that reflects what it really means.

Regards;
--
Koichi Suzuki
