Re: Compression of full-page-writes - Mailing list pgsql-hackers

From ktm@rice.edu
Subject Re: Compression of full-page-writes
Msg-id 20131024171929.GI2790@aart.rice.edu
In response to Re: Compression of full-page-writes  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Thu, Oct 24, 2013 at 12:22:59PM -0400, Robert Haas wrote:
> On Thu, Oct 24, 2013 at 11:40 AM, ktm@rice.edu <ktm@rice.edu> wrote:
> > On Thu, Oct 24, 2013 at 11:07:38AM -0400, Robert Haas wrote:
> >> On Mon, Oct 21, 2013 at 11:52 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> >> > So, our consensus is to introduce the hooks for FPW compression so that
> >> > users can freely select their own best compression algorithm?
> >> > Also, probably we need to implement at least one compression contrib module
> >> > using that hook, maybe it's based on pglz or snappy.
> >>
> >> I don't favor making this pluggable. I think we should pick snappy or
> >> lz4 (or something else), put it in the tree, and use it.
> >>
> > Hi,
> >
> > My vote would be for lz4 since it has faster single-threaded compression
> > and decompression speeds, with its decompression speed being almost 2X
> > snappy's. Both are BSD licensed, so that is not an issue. The base code
> > for lz4 is C, while snappy's is C++. There is also an HC
> > (high-compression) variant of lz4 that pushes its compression ratio to
> > about the same as zlib (-1). It uses the same decompressor, which can
> > provide data even faster due to the better compression. Some more
> > real-world tests would be useful, which is really where being pluggable
> > would help.
> 
> Well, it's probably a good idea for us to test, during the development
> cycle, which algorithm works better for WAL compression, and then use
> that one.  Once we make that decision, I don't see that there are many
> circumstances in which a user would care to override it.  Now if we
> find that there ARE reasons for users to prefer different algorithms
> in different situations, that would be a good reason to make it
> configurable (or even pluggable).  But if we find that no such reasons
> exist, then we're better off avoiding burdening users with the need to
> configure a setting that has only one sensible value.
> 
> It seems fairly clear from previous discussions on this mailing list
> that snappy and lz4 are the top contenders for the position of
> "compression algorithm favored by PostgreSQL".  I am wondering,
> though, whether it wouldn't be better to add support for both - say we
> added both to libpgcommon, and perhaps we could consider moving pglz
> there as well.  That would allow easy access to all of those
> algorithms from both frontend and backend code.  If we can make the
> APIs parallel, it should be very simple to modify any code we add now to
> use a different algorithm than the one initially chosen if in the
> future we add algorithms to or remove algorithms from the list, or if
> one algorithm is shown to outperform another in some particular
> context.  I think we'll do well to isolate the question of adding
> support for these algorithms from the current patch or any other
> particular patch that may be on the table, and FWIW, I think having
> two leading contenders and adding support for both may have a variety
> of advantages over crowning a single victor.
> 
+++1
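
To make the "parallel APIs" idea a bit more concrete, here is a rough sketch of one shape it could take: every algorithm exposes the same compress/decompress signature and is selected by name from a table. This is only an illustration under stated assumptions. The names FpwCompressor, fpw_lookup and fpw_roundtrip_ok are hypothetical, and the "store" and "rle" entries are toy stand-ins so that the example compiles without any library. They are not the lz4, snappy or pglz APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical common interface: each algorithm provides the same two calls.
 * Both return the number of bytes written to dst, or -1 on failure. */
typedef struct FpwCompressor
{
    const char *name;
    long (*compress)(const unsigned char *src, size_t srclen,
                     unsigned char *dst, size_t dstcap);
    long (*decompress)(const unsigned char *src, size_t srclen,
                       unsigned char *dst, size_t dstcap);
} FpwCompressor;

/* Toy stand-in #1: "store" is a plain copy in both directions. */
static long
store_copy(const unsigned char *src, size_t srclen,
           unsigned char *dst, size_t dstcap)
{
    if (srclen > dstcap)
        return -1;
    memcpy(dst, src, srclen);
    return (long) srclen;
}

/* Toy stand-in #2: run-length encoding as (count, value) byte pairs. */
static long
rle_compress(const unsigned char *src, size_t srclen,
             unsigned char *dst, size_t dstcap)
{
    size_t in = 0, out = 0;

    while (in < srclen)
    {
        unsigned char value = src[in];
        size_t run = 1;

        while (in + run < srclen && src[in + run] == value && run < 255)
            run++;
        if (out + 2 > dstcap)
            return -1;
        dst[out++] = (unsigned char) run;
        dst[out++] = value;
        in += run;
    }
    return (long) out;
}

static long
rle_decompress(const unsigned char *src, size_t srclen,
               unsigned char *dst, size_t dstcap)
{
    size_t in, out = 0;

    if (srclen % 2 != 0)
        return -1;              /* input must be whole (count, value) pairs */
    for (in = 0; in < srclen; in += 2)
    {
        size_t run = src[in];

        if (run == 0 || out + run > dstcap)
            return -1;
        memset(dst + out, src[in + 1], run);
        out += run;
    }
    return (long) out;
}

/* Adding or removing an algorithm only touches this table. */
static const FpwCompressor fpw_compressors[] = {
    {"store", store_copy, store_copy},
    {"rle", rle_compress, rle_decompress},
};

/* Look up an algorithm by name; returns NULL if it is not in the table. */
const FpwCompressor *
fpw_lookup(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(fpw_compressors) / sizeof(fpw_compressors[0]); i++)
        if (strcmp(fpw_compressors[i].name, name) == 0)
            return &fpw_compressors[i];
    return NULL;
}

/* Compress and decompress a page image (at most 8192 bytes) through the
 * common interface; returns 1 if the round trip reproduces the input. */
int
fpw_roundtrip_ok(const FpwCompressor *c, const unsigned char *page, size_t len)
{
    unsigned char packed[2 * 8192];     /* RLE can double the size at worst */
    unsigned char unpacked[8192];
    long packed_len, unpacked_len;

    assert(c != NULL && len <= sizeof(unpacked));
    packed_len = c->compress(page, len, packed, sizeof(packed));
    if (packed_len < 0)
        return 0;
    unpacked_len = c->decompress(packed, (size_t) packed_len,
                                 unpacked, sizeof(unpacked));
    return unpacked_len == (long) len && memcmp(page, unpacked, len) == 0;
}
```

A caller would only ever write something like fpw_lookup("rle")->compress(...), so swapping in a different algorithm later means changing the name passed in, not the calling code.
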

Ken


