Re: [REVIEW] Re: Compression of full-page-writes - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: [REVIEW] Re: Compression of full-page-writes
Msg-id CAHGQGwHXvT4eYOZ7G-wcBa-s43KHb9O0XauqRxfH4R8ZT36jjA@mail.gmail.com
In response to Re: [REVIEW] Re: Compression of full-page-writes  (Pavan Deolasee <pavan.deolasee@gmail.com>)
Responses Re: [REVIEW] Re: Compression of full-page-writes
List pgsql-hackers
On Wed, Jul 23, 2014 at 5:21 PM, Pavan Deolasee
<pavan.deolasee@gmail.com> wrote:
> 1. Need for compressing full page backups:
> There are a good number of benchmarks done by various people on this list
> which clearly show the need for the feature. Many people have already voiced
> their agreement on having this in core, even as a configurable parameter.

Yes!

> Having said that, IMHO we should go one step at a time. We have been using
> pglz for compressing toast data for a long time, so we can continue to use
> the same for compressing full page images. We can simultaneously work on
> adding more algorithms to core and choose the right candidate for different
> scenarios such as toast or FPW based on test evidence. But that work can
> happen independently of this patch.

This gradual approach looks good to me. And if an additional compression
algorithm like lz4 is always better than pglz in every scenario, we can just
change the code so that the additional algorithm is always used, which would
make the code simpler.

> 3. Compressing one block vs all blocks:
> Andres suggested that compressing all backup blocks in one go may give us
> a better compression ratio. This is worth trying. I'm wondering what would be
> the best way to do so with minimal changes to the xlog insertion code. Today,
> we add more rdata items for backup block header(s) and backup blocks
> themselves (if there is a "hole" then 2 per backup block) beyond what the
> caller has supplied. If we have to compress all the backup blocks together,
> then one approach is to copy the backup block headers and the blocks to a
> temp buffer, compress that and replace the rdata entries added previously
> with a single rdata.

Basically sounds reasonable. But how does this logic work if there are
multiple rdata entries and only some of them are backup blocks?
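
To make the question concrete, here is a rough sketch of one possible
answer: walk the chain, copy only the backup-block entries into the temp
buffer, unlink them, and append a single combined entry. The Entry struct,
the is_backup flag and gather_backup_entries() are all hypothetical
stand-ins for illustration, not the actual XLogRecData code, and the
compression step itself is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one rdata chain entry (not XLogRecData). */
typedef struct Entry
{
    const char   *data;
    size_t        len;
    int           is_backup;    /* nonzero if this entry holds backup block data */
    struct Entry *next;
} Entry;

/*
 * Copy all backup entries into buf in chain order, unlink them from the
 * chain, and append "combined" as a single entry pointing at buf.
 * Non-backup entries keep their relative order.
 *
 * Returns the number of bytes gathered, or (size_t) -1 if buf is too small
 * (in which case the chain is left untouched).  A real implementation would
 * compress buf before linking it in; that step is omitted here.
 */
static size_t
gather_backup_entries(Entry **head, Entry *combined, char *buf, size_t bufsize)
{
    size_t  total = 0;
    size_t  used = 0;
    Entry  *e;
    Entry **link;

    assert(head != NULL && combined != NULL && buf != NULL);

    /* First pass: make sure everything fits before modifying the chain. */
    for (e = *head; e != NULL; e = e->next)
    {
        if (e->is_backup)
            total += e->len;
    }
    if (total > bufsize)
        return (size_t) -1;

    /* Second pass: copy backup entries out and unlink them. */
    link = head;
    while (*link != NULL)
    {
        e = *link;
        if (e->is_backup)
        {
            memcpy(buf + used, e->data, e->len);
            used += e->len;
            *link = e->next;        /* drop this entry from the chain */
        }
        else
            link = &e->next;
    }

    /* Append the single combined entry, if there was anything to gather. */
    if (used > 0)
    {
        combined->data = buf;
        combined->len = used;
        combined->is_backup = 1;
        combined->next = NULL;
        *link = combined;
    }
    return used;
}
```

Note that this moves the backup data to the end of the chain, which only
matches the current layout if the backup block entries are already appended
after the caller-supplied ones.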

If a "hole" is not copied to that temp buffer, ISTM that we should
change the backup block header so that it contains the info for the
"hole", e.g., the location where the "hole" starts. No?
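
As an illustration of why that info is needed, here is a minimal sketch of
packing a page into the temp buffer without its "hole" and restoring it
later. HoleInfo, pack_page() and unpack_page() are hypothetical names for
this example, and the 8192-byte page size is an assumption; without the
offset and length recorded somewhere, the restore side cannot tell where to
put the zeroes back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 8192          /* assumed page size for this sketch */

/* Hypothetical per-block header fields describing the hole. */
typedef struct HoleInfo
{
    uint16_t    hole_offset;    /* byte offset where the hole starts */
    uint16_t    hole_length;    /* number of bytes in the hole */
} HoleInfo;

/*
 * Copy the page into out, skipping the hole.  out must have room for
 * PAGE_SIZE - hole_length bytes.  Returns the number of bytes written.
 */
static size_t
pack_page(const char *page, const HoleInfo *h, char *out)
{
    size_t  tail_start = (size_t) h->hole_offset + h->hole_length;

    assert(tail_start <= PAGE_SIZE);
    memcpy(out, page, h->hole_offset);
    memcpy(out + h->hole_offset, page + tail_start, PAGE_SIZE - tail_start);
    return PAGE_SIZE - h->hole_length;
}

/*
 * Rebuild the full page from the packed image, zero-filling the hole at
 * the position recorded in the header.
 */
static void
unpack_page(const char *in, const HoleInfo *h, char *page)
{
    size_t  tail_start = (size_t) h->hole_offset + h->hole_length;

    assert(tail_start <= PAGE_SIZE);
    memcpy(page, in, h->hole_offset);
    memset(page + h->hole_offset, 0, h->hole_length);
    memcpy(page + tail_start, in + h->hole_offset, PAGE_SIZE - tail_start);
}
```

In the compress-everything-together approach, the packed images would be
what gets copied to the temp buffer, with each header carrying the hole
info alongside.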

Regards,

-- 
Fujii Masao


