Re: Would it be possible to have parallel archiving? - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: Would it be possible to have parallel archiving?
Date
Msg-id 20180828170834.GG3326@tamriel.snowman.net
In response to Re: Would it be possible to have parallel archiving?  (David Steele <david@pgmasters.net>)
Responses Re: Would it be possible to have parallel archiving?
List pgsql-hackers
Greetings,

* David Steele (david@pgmasters.net) wrote:
> On 8/28/18 8:32 AM, Stephen Frost wrote:
> >
> > * hubert depesz lubaczewski (depesz@depesz.com) wrote:
> >> I'm in a situation where we quite often generate more WAL than we can
> >> archive. The thing is - archiving takes a long(ish) time, but it's a
> >> multi-step process and includes talking to remote servers over the network.
> >>
> >> I tested that simply by running archiving in parallel I can easily get
> >> 2-3 times higher throughput.
> >>
> >> But - I'd prefer to keep postgresql knowing what is archived, and what
> >> not, so I can't do the parallelization on my own.
> >>
> >> So, the question is: is it technically possible to have parallel
> >> archiving, and would anyone be willing to work on it? (Sorry, my
> >> C skills are basically none, so I can't realistically hack it myself.)
> >
> > Not entirely sure what the concern is around "postgresql knowing what is
> > archived", but pgbackrest already does exactly this parallel archiving
> > for environments where the WAL volume is larger than a single thread can
> > handle, and we've been rewriting it in C specifically to make it fast
> > enough to be able to keep PG up-to-date regarding what's been pushed
> > already.
>
> To be clear, pgBackRest uses the .ready files in archive_status to
> parallelize archiving but still notifies PostgreSQL of completion via
> the archive_command mechanism.  We do not modify .ready files to .done
> directly.

Right, we don't recommend mucking around with that directory of files.
Even if that works today (which you'd need to test extensively...),
there's no guarantee that it'll work and do what you want in the
future...
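For illustration, the general shape of the technique David describes (scan
archive_status for *.ready files, push segments in parallel, and let
archive_command report success only for segments already in the repository)
could be sketched as below. This is only a sketch of the idea, not
pgBackRest's actual code; the helper names (ready_segments, push_segment,
archive_command) and the in-memory pushed set are assumptions invented for
this example.

```python
# Sketch of .ready-driven parallel WAL pushing.  NOT pgBackRest's
# implementation; all names here are illustrative assumptions.
import os
from concurrent.futures import ThreadPoolExecutor


def ready_segments(archive_status_dir):
    """List WAL segment names PostgreSQL has marked ready to archive
    (it creates <segment>.ready files in pg_wal/archive_status)."""
    suffix = ".ready"
    return sorted(
        f[:-len(suffix)]
        for f in os.listdir(archive_status_dir)
        if f.endswith(suffix)
    )


def push_segment(wal_dir, name, pushed):
    """Placeholder for the real work: compress and ship one segment to
    the repository, then record it as safely stored."""
    # ... copy os.path.join(wal_dir, name) to the archive here ...
    pushed.add(name)


def push_all(wal_dir, archive_status_dir, pushed, workers=4):
    """Push every ready segment in parallel -- the source of the 2-3x
    throughput improvement depesz observed."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for name in ready_segments(archive_status_dir):
            pool.submit(push_segment, wal_dir, name, pushed)


def archive_command(name, pushed):
    """What PostgreSQL's archive_command would invoke per segment:
    return 0 (success) only if the segment is already in the repository,
    so PostgreSQL stays authoritative about what has been archived."""
    return 0 if name in pushed else 1
```

The key property is the last function: PostgreSQL still drives the
.ready -> .done transition itself via archive_command's exit code, so the
archive_status files are never modified behind its back.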

> However, we have optimized the C code to provide ~200
> notifications/second (3.2GB/s of WAL transfer) which is enough to keep
> up with the workloads we have seen.  Larger WAL segment sizes in PG11
> will theoretically increase this to 200GB/s, though in practice CPU to
> do the compression will become a major bottleneck, not to mention
> network, etc.

Agreed.
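As a quick sanity check on those figures (assuming the default 16MB WAL
segment size, and the 1GB maximum segment size PG11's initdb allows):

```python
# Back-of-the-envelope check of the quoted throughput numbers:
# notifications per second times WAL segment size.
notifications_per_sec = 200

default_segment_mb = 16                      # default WAL segment size
throughput_gb = notifications_per_sec * default_segment_mb / 1000
print(throughput_gb)                         # 3.2 (GB/s, as quoted)

pg11_max_segment_gb = 1                      # largest segment size in PG11
print(notifications_per_sec * pg11_max_segment_gb)  # 200 (GB/s, theoretical)
```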

Thanks!

Stephen

