Re: backup manifests - Mailing list pgsql-hackers

From David Steele
Subject Re: backup manifests
Date
Msg-id 2cec49b2-eeab-574f-2b5e-1f437b2cbec5@pgmasters.net
Whole thread Raw
In response to Re: backup manifests  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 9/20/19 10:59 AM, Robert Haas wrote:
> On Fri, Sep 20, 2019 at 9:46 AM Robert Haas <robertmhaas@gmail.com> wrote:
>> - appendStringInfo et. al. I don't think it would be that hard to move
>> this to src/common, but I'm also not sure it really solves the
>> problem, because StringInfo has a 1GB limit, and there's no rule at
>> all that a backup manifest has got to be less than 1GB.
> 
> Hmm.  That's actually going to be a problem on the server side, no
> matter what we do on the client side.  We have to send the manifest
> after we send everything else, so that we know what we sent. But if we
> sent a lot of files, the manifest might be really huge. I had been
> thinking that we would generate the manifest on the server and send it
> to the client after everything else, but maybe this is an argument for
> generating the manifest on the client side and writing it
> incrementally. That would require the client to peek at the contents
> of every tar file it receives all the time, which it currently doesn't
> need to do, but it does peek inside them a little bit, so maybe it's
> OK.
> 
> Another alternative would be to have the server spill the manifest in
> progress to a temp file and then stream it from there to the client.

This seems reasonable to me.

We keep an in-memory representation which is just an array of structs
and is fairly compact -- 1 million files uses ~150MB of memory.  We just
format and stream this to storage when saving.  Saving is easier than
loading, of course.

-- 
-David
david@pgmasters.net



pgsql-hackers by date:

Previous
From: Luis Carril
Date:
Subject: Re: Option to dump foreign data in pg_dump
Next
From: Chapman Flack
Date:
Subject: Re: backup manifests