Re: Allowing multiple concurrent base backups - Mailing list pgsql-hackers

From Aidan Van Dyk
Subject Re: Allowing multiple concurrent base backups
Date
Msg-id AANLkTi=7_CkbuMVGzkOyOqjyp90Pn2XkMk9_r4xM3-LM@mail.gmail.com
In response to Re: Allowing multiple concurrent base backups  (David Fetter <david@fetter.org>)
List pgsql-hackers
On Wed, Jan 12, 2011 at 10:15 AM, David Fetter <david@fetter.org> wrote:

>> Considering that parallel base backups would be I/O-bound (or
>> network-bound), there is little need to actually run them in parallel
>
> That's not actually true.  Backups at the moment are CPU-bound, and
> running them in parallel is one way to make them closer to I/O-bound,
> which is what they *should* be.

Remember, we're talking about filesystem base backups here.  If your
CPU can't handle a stream from disk -> network, byte for byte (maybe
encrypting it), then you've spent *WAAAAY* too much on your storage
sub-system, and way too little on CPU.

I can see trying to "parallelize" the base backup such that each
tablespace could be run concurrently, but that's about it.
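
To make the per-tablespace idea concrete, here is a rough sketch and
nothing more.  It is not how PostgreSQL does it: the function name
copy_tablespaces and the name -> directory mapping are made up for
illustration, and it leaves out the pg_start_backup()/pg_stop_backup()
calls that a real base backup needs around the copy.  It only shows one
plain byte-for-byte copy per tablespace, each in its own thread.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def copy_tablespaces(tablespaces, dest_root, max_workers=None):
    """Copy each tablespace directory in its own worker thread.

    tablespaces: mapping of name -> source directory (hypothetical layout).
    Returns a mapping of name -> destination path.
    """
    dest_root = Path(dest_root)
    dest_root.mkdir(parents=True, exist_ok=True)

    def copy_one(item):
        name, src = item
        dest = dest_root / name
        # Plain file copy: the work is disk/network I/O, not CPU.
        shutil.copytree(src, dest)
        return name, dest

    # Default to one worker per tablespace, so each copy runs concurrently.
    workers = max_workers or max(1, len(tablespaces))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(copy_one, tablespaces.items()))
```

This only buys anything if the tablespaces sit on separate spindles or
arrays; on a single device the copies just compete for the same I/O.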

> There are other proposals out there, and some work being done, to make
> backups less dependent on CPU, among them:
>
> - Making the on-disk representation smaller
> - Making COPY more efficient
>
> As far as I know, none of this work is public yet.

pg_dump is another story.  But it's not related to base backups for
PIT Recovery/Replication.

a.

--
Aidan Van Dyk                                             Create like a god,
aidan@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.

