Re: Streaming base backups - Mailing list pgsql-hackers

From Magnus Hagander
Subject Re: Streaming base backups
Msg-id AANLkTimV2q4w0jEus_Mwyjp+=0w2syirOohnbYTer8zg@mail.gmail.com
In response to Re: Streaming base backups  (Dimitri Fontaine <dimitri@2ndQuadrant.fr>)
Responses Re: Streaming base backups  (Dimitri Fontaine <dimitri@2ndQuadrant.fr>)
Re: Streaming base backups  (Cédric Villemain <cedric.villemain.debian@gmail.com>)
List pgsql-hackers
On Wed, Jan 5, 2011 at 22:58, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>> Attached is an updated streaming base backup patch, based off the work
>
> Thanks! :)
>
>> * Compression: Do we want to be able to compress the backups server-side? Or
>>   defer that to whenever we get compression in libpq? (you can still tunnel it
>>   through for example SSH to get compression if you want to) My thinking is
>>   defer it.
>
> Compression in libpq would be a nice way to solve it, later.

Yeah, I'm pretty much set on postponing that one.


>> * Compression: We could still implement compression of the tar files in
>>   pg_streamrecv (probably easier, possibly more useful?)
>
> What about pg_streamrecv | gzip > …, which has the big advantage of
> being friendly to *any* command-line compression tool, whatever the
> patents and licenses involved?

That's part of what I meant by "easier and more useful".

Right now, though, pg_streamrecv will output one tar file for each
tablespace, so you can't get it on stdout. But that can be changed, of
course. The easiest first step is to just use gzopen() from zlib on the
output files and keep the rest of the code as it is now :-)
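
For illustration only, a self-contained sketch of that first step (this is
not the actual pg_streamrecv code; the tar data comes from stdin here just
to keep the example compilable, whereas in pg_streamrecv it would come off
the COPY stream):

#include <stdio.h>
#include <zlib.h>

/*
 * Sketch: write whatever arrives on stdin through zlib's gzopen()/gzwrite()
 * instead of a plain FILE, which is essentially the only change the
 * existing per-tablespace writing code would need.
 */
int
main(int argc, char **argv)
{
    char    buf[8192];
    size_t  len;
    gzFile  gz;

    if (argc != 2)
    {
        fprintf(stderr, "usage: %s output.tar.gz < tarstream\n", argv[0]);
        return 1;
    }

    gz = gzopen(argv[1], "wb9");        /* "wb9" = write, compression level 9 */
    if (gz == NULL)
    {
        fprintf(stderr, "could not open %s\n", argv[1]);
        return 1;
    }

    while ((len = fread(buf, 1, sizeof(buf), stdin)) > 0)
    {
        if (gzwrite(gz, buf, (unsigned) len) != (int) len)
        {
            fprintf(stderr, "gzwrite failed\n");
            return 1;
        }
    }

    gzclose(gz);
    return 0;
}

(Build with -lz.)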


>> * Stefan mentioned it might be useful to put some
>> posix_fadvise(POSIX_FADV_DONTNEED)
>>   in the process that streams all the files out. Seems useful, as long as that
>>   doesn't kick them out of the cache *completely*, for other backends as well.
>>   Do we know if that is the case?
>
> Maybe have a look at pgfincore to only tag DONTNEED for blocks that are
> not already in SHM?

I think that's way more complex than we want to go here.
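
The plain DONTNEED hint itself is simple enough; roughly something like
this in the loop that reads and sends each file (untested sketch, the
surrounding code is simplified and stdout stands in for the COPY stream).
The open question is the cache interaction: as far as I know, the Linux
page cache is shared, so POSIX_FADV_DONTNEED drops the pages for every
process, not just the one issuing the hint.

#define _XOPEN_SOURCE 600               /* for posix_fadvise() */

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
    char    buf[8192];
    ssize_t len;
    int     fd;

    if (argc != 2)
    {
        fprintf(stderr, "usage: %s file\n", argv[0]);
        return 1;
    }

    fd = open(argv[1], O_RDONLY);
    if (fd < 0)
    {
        perror("open");
        return 1;
    }

    /* Stream the file out; stdout stands in for the backup COPY stream. */
    while ((len = read(fd, buf, sizeof(buf))) > 0)
    {
        if (write(STDOUT_FILENO, buf, len) != len)
            return 1;
    }

    /*
     * Advise the kernel that we will not need these pages again.  Purely
     * advisory, so any failure is ignored.  Note that this evicts the
     * pages from the shared page cache, affecting other processes too.
     */
    (void) posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);

    close(fd);
    return 0;
}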


>> * include all the necessary WAL files in the backup. This way we could generate
>>   a tar file that would work on its own - right now, you still need to set up
>>   log archiving (or use streaming repl) to get the remaining logfiles from the
>>   master. This is fine for replication setups, but not for backups.
>>   This would also require us to block recycling of WAL files during the backup,
>>   of course.
>
> Well, I would guess that if you're streaming the WAL files in parallel
> while the base backup is taken, then you're able to have it all without
> an archiving setup, and the server could still recycle them.

Yes, this was mostly for the use-case of "getting a single tarfile
that you can actually use to restore from without needing the log
archive at all".

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

