Re: design for parallel backup - Mailing list pgsql-hackers

From Andres Freund
Subject Re: design for parallel backup
Date
Msg-id 20200422190324.brwmxqzcxv7xvhbb@alap3.anarazel.de
In response to Re: design for parallel backup  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: design for parallel backup  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Hi,

On 2020-04-22 14:40:17 -0400, Robert Haas wrote:
> > Oh? I find it *extremely* exciting here. This is pretty close to the
> > worst case compressibility-wise, and zstd takes only ~22% of the time as
> > gzip does, while still delivering better compression.  A nearly 5x
> > improvement in compression times seems pretty exciting to me.
> >
> > Or do you mean for zstd over lz4, rather than anything over gzip?  1.8x
> > -> 2.3x is a pretty decent improvement still, no? And being able to do
> > it in 1/3 of the wall time seems pretty helpful.
> 
> I meant the latter thing, not the former. I'm taking it as given that
> we don't want gzip as the only option. Yes, 1.8x -> 2.3x is decent,
> but not as earth-shattering as 8.8x -> ~24x.

Ah, good.


> In any case, I lean towards adding both lz4 and zstd as options, so I
> guess we're not really disagreeing here.

We're agreeing, indeed ;)


> > I agree we should pick one. I think tar is not a great choice. .zip
> > seems like it'd be a significant improvement - but not necessarily
> > optimal.
> 
> Other ideas?

The 7zip format, perhaps. It does have format-level support to address what
we were discussing earlier: "Support for solid compression, where
multiple files of like type are compressed within a single stream, in
order to exploit the combined redundancy inherent in similar files.".
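
To make the "solid compression" point concrete, here is a rough sketch of
the effect, not of 7zip itself. It uses Python's zlib as a stand-in
compressor (7zip would normally use LZMA), and the file contents and the
helper names per_file_size / solid_size are made up for illustration:

```python
import zlib

# Hypothetical stand-ins for many small, similar files (not real relation data).
files = [
    b"page header; tuple data %d; " % i + bytes(range(256)) * 4
    for i in range(20)
]


def per_file_size(blobs):
    """Total compressed size when every file gets its own stream."""
    return sum(len(zlib.compress(blob)) for blob in blobs)


def solid_size(blobs):
    """Compressed size when all files share a single stream."""
    return len(zlib.compress(b"".join(blobs)))


if __name__ == "__main__":
    print("per-file streams:", per_file_size(files), "bytes")
    print("single solid stream:", solid_size(files), "bytes")
```

With a single stream the compressor can refer back to earlier files, so the
redundancy between similar files is paid for only once. The flip side is
that extracting one member means decompressing everything before it in the
same stream.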

Greetings,

Andres Freund


