Re: design for parallel backup - Mailing list pgsql-hackers
| From | Andres Freund |
|---|---|
| Subject | Re: design for parallel backup |
| Date | |
| Msg-id | 20200504194145.lw6c34moqsykxmfj@alap3.anarazel.de |
| In response to | Re: design for parallel backup (Robert Haas <robertmhaas@gmail.com>) |
| List | pgsql-hackers |
Hi,

On 2020-05-04 14:04:32 -0400, Robert Haas wrote:
> OK, thanks. Let me see if I can summarize here. On the strength of
> previous experience, you'll probably tell me that some parts of this
> summary are wildly wrong or at least "not quite correct" but I'm going
> to try my best.

> - Server-side compression seems like it has the potential to be a
> significant win by stretching bandwidth. We likely need to do it with
> 10+ parallel threads, at least for stronger compressors, but these
> might be threads within a single PostgreSQL process rather than
> multiple separate backends.

That seems right. I think it might be reasonable to just support "compression parallelism" for zstd, as the library has all the code internally (a minimal libzstd sketch appears at the end of this message). So we basically wouldn't have to care about it.

> - Client-side cache management -- that is, use of
> posix_fadvise(DONTNEED), posix_fallocate, and sync_file_range, where
> available -- looks like it can improve write rates and CPU efficiency
> significantly. Larger block sizes show a win when used together with
> such techniques.

Yea. Alternatively direct IO, but I am not sure we want to go there for now.

> - The benefits of multiple concurrent connections remain somewhat
> elusive. Peter Eisentraut hypothesized upthread that such an approach
> might be the most practical way forward for networks with a high
> bandwidth-delay product, and I hypothesized that such an approach
> might be beneficial when there are multiple tablespaces on independent
> disks, but we don't have clear experimental support for those
> propositions. Also, both your data and mine indicate that too much
> parallelism can lead to major regressions.

I think for that we'd basically have to create two high-bandwidth nodes across the pond. My experience in the somewhat recent past is that I could saturate multi-gbit cross-Atlantic links without too much trouble, at least once I changed net.ipv4.tcp_congestion_control to something appropriate for such setups (BBR is probably the thing to use here these days).

> - Any work we do while trying to make backup super-fast should also
> lend itself to super-fast restore, possibly including parallel
> restore.

I'm not sure I see a super clear case for parallel restore in any of the experiments done so far. The only case where we know it's a clear win is when there are independent filesystems for parts of the data. There's an obvious case for parallel decompression, however.

> Compressed tarfiles don't permit random access to member files.

This is an issue for selective restores too, not just parallel restore. I'm not sure how important a case that is, although it'd certainly be useful if e.g. pg_rewind could read from compressed base backups.

> Uncompressed tarfiles do, but software that works this way is not
> commonplace.

I am not 100% sure which part you comment on not being commonplace here. Supporting randomly accessing data in tarfiles? My understanding of that is that one still has to "skip" through the entire archive, right? What not being compressed allows is to not have to read the files in between. Given the size of our data files compared to the metadata size, that's probably fine?
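For concreteness, here is a minimal sketch of that access pattern, assuming a plain ustar archive; find_tar_member() and its simplified header handling are illustrative only, not code from pg_basebackup or any existing tool. Every 512-byte header block has to be visited, but the member data in between is skipped with seeks instead of being read:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Walk an uncompressed ustar archive and return the size of the member
 * named "wanted", leaving the file position at the start of its data.
 * Every header block is read, but member contents are skipped via seeks.
 */
static long
find_tar_member(FILE *archive, const char *wanted)
{
	char		hdr[512];

	while (fread(hdr, 1, sizeof(hdr), archive) == sizeof(hdr))
	{
		char		sizebuf[13];
		unsigned long size;

		if (hdr[0] == '\0')
			break;				/* end-of-archive marker */

		/* member size is stored as a NUL/space-terminated octal string at offset 124 */
		memcpy(sizebuf, hdr + 124, 12);
		sizebuf[12] = '\0';
		size = strtoul(sizebuf, NULL, 8);

		if (strncmp(hdr, wanted, 100) == 0)
			return (long) size;	/* member data follows immediately */

		/* skip the data, which is padded to a 512-byte boundary */
		if (fseek(archive, (long) ((size + 511) & ~511UL), SEEK_CUR) != 0)
			break;
	}
	return -1;
}
```

Once whole-archive compression is applied, those seeks turn into sequential decompression of everything in between, which is why compressed tarfiles lose random access.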
> The only mainstream archive format that seems to support random access
> seems to be zip. Adopting that wouldn't be crazy, but might limit our
> choice of compression options more than we'd like.

I'm not sure that's *really* an issue - there are compression format codes in zip ([1] 4.4.5, also 4.3.14.3 & 4.5 for another approach), and several tools seem to have used that to add additional compression methods.

> A tar file of individually compressed files might be a plausible
> alternative, though there would probably be some hit to compression
> ratios for small files.

I'm not entirely sure using zip over an uncompressed tar of individually compressed files gains us all that much. AFAIU zip compresses each file individually. So the advantage would be a more efficient (less seeking) storage of archive metadata (i.e. which file is where) and that the metadata could be compressed.

> Then again, if a single, highly-efficient process can handle a
> server-to-client backup, maybe the same is true for extracting a
> compressed tarfile...

Yea. I'd expect that to be the case, at least for the single-filesystem case. Depending on the way multiple tablespaces / filesystems are handled, it could even be doable to handle that reasonably - but it'd probably be harder.

Greetings,

Andres Freund

[1] https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT
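Regarding the "library has all the code internally" point about zstd earlier in this message: a minimal sketch of what driving libzstd's built-in worker threads looks like. The function name, worker count, and compression level below are illustrative assumptions, and a libzstd new enough to provide ZSTD_c_nbWorkers and built with multithreading support is assumed.

```c
#include <zstd.h>

/*
 * Compress one buffer using libzstd's internal worker threads.  With
 * ZSTD_c_nbWorkers > 0, ZSTD_compressStream2() splits the input into
 * jobs and compresses them on background threads, so the caller does
 * not have to manage any threading itself.  Returns the compressed
 * size, or 0 on failure / insufficient output space.
 */
static size_t
compress_with_workers(const void *src, size_t srclen,
					  void *dst, size_t dstcap)
{
	ZSTD_CCtx  *cctx = ZSTD_createCCtx();
	ZSTD_inBuffer in = {src, srclen, 0};
	ZSTD_outBuffer out = {dst, dstcap, 0};
	size_t		remaining;

	ZSTD_CCtx_setParameter(cctx, ZSTD_c_compressionLevel, 3);	/* illustrative */
	ZSTD_CCtx_setParameter(cctx, ZSTD_c_nbWorkers, 8);			/* illustrative */

	do
	{
		remaining = ZSTD_compressStream2(cctx, &out, &in, ZSTD_e_end);
		if (ZSTD_isError(remaining) || out.pos == out.size)
			break;				/* error, or output buffer full */
	} while (remaining != 0);

	ZSTD_freeCCtx(cctx);
	return (!ZSTD_isError(remaining) && remaining == 0) ? out.pos : 0;
}
```

If the library was built without multithreading, setting ZSTD_c_nbWorkers simply fails and compression proceeds single-threaded, so the calling code needs no thread management of its own either way.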