Re: Include WAL in base backup - Mailing list pgsql-hackers

From Magnus Hagander
Subject Re: Include WAL in base backup
Date
Msg-id AANLkTikgrqwjyhPCFtKCX_NqOi_4YWcDOJQNDwhojOGC@mail.gmail.com
Whole thread Raw
In response to Re: Include WAL in base backup  (Stephen Frost <sfrost@snowman.net>)
Responses Re: Include WAL in base backup  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Thu, Jan 20, 2011 at 05:03, Stephen Frost <sfrost@snowman.net> wrote:
> Greetings,
>
> * Magnus Hagander (magnus@hagander.net) wrote:
>> For now, you need to set wal_keep_segments to make it work properly,
>> but if you do the idea is that the tar file/stream generated in the
>> base backup will include all the required WAL files.
>
> Is there some reason to not ERROR outright if we're asked to provide WAL
> and wal_keep_segments isn't set..?  I'd rather do that than only ERROR
> when a particular WAL is missing..  That could lead to transient backup
> errors that an inexperienced sysadmin or DBA might miss or ignore.
> They'll notice if it doesn't work the first time they try it and spits
> out a hint about wal_keep_segments.

Well, in a "smaller:ish" database you can easily do the full backup
before you run out of segments in the data directory even when you
haven't set wal_keep_segments. If we error out, we force extra work on
the user in the trivial case.

I'd rather not change that, but instead (as Fujii-san has also
mentioned is needed anyway) put some more effort into documenting in
which cases you need to set it.


>> I've got some refactoring I want to do around the
>> SendBackupDirectory() function after this, but a review of the
>> functionality first would be good. And obviously, documentation is
>> still necessary.
>
> mkay, I'm not going to try to make this ready for committer, but will
> provide my comments on it overall.  Bit difficult to review when someone
> else is reviewing the base patch too. :/

Heh, yeah.


> Here goes:
>
> - I'm not a huge fan of the whole 'closetar' option, that feels really
>  rather wrong to me.  Why not just open it and close it in
>  perform_base_backup(), unconditionally?

Yeah, we could move the whole thing up there. Or, as I mentioned in an
IM conversation with Heikki, just get rid of SendBackupDirectory()
completely and inline it inside the loop in perform_base_backup().
Given that it's basically just 5 lines + a call to sendDir()..


> - I wonder if you're not getting to a level where you shold be using a
>  struct to pass the relevant information to perform_base_backup()
>  instead of adding more arguments on..  That's going to get unwieldy at
>  some point.

Yeah, probably.

We *could* pass the BaseBackupCmd struct from the parser all the way
in - or is that cheating too much on abstractions? It seems if we
don't, we're just going to hav ea copy of that struct without the
NodeTag member..


> - Why not initialize logid and logseg like so?:
>
>        int logid = startptr.xlogid;
>        int logseg = startptr.xrecoff / XLogSegSize;
>
>  Then use those in your elog?  Seems cleaner to me.

Hmm. Yes. Agreed.


> - A #define right in the middle of a while loop...?  Really?

Haha, yeah, that was a typo. I didn't remember the name of the
variable so I stuck it there for testing and forgot it. It should be
ThisTimeLineID, and no #define at all.


> - The grammar changes strike me as..  odd.  Typically, you would have an
>  'option' production that you can then have a list of and then let each
>  option be whatever the OR'd set of options is.  Wouldn't the current
>  grammar require that you put the options in a specific order?  That'd
>  blow.

It does require them in a specific order. The advantage is that it
makes the code easier. and it's not like end users are expected to run
them anyway...

Now, I'm no bison export, so it might be an easy fix. But the way I
could figure, to make them order indepdent I have to basically collect
them up together as a List instead of just as a struct, and then loop
through that list to build a struct later.

If someone who knows Bison better can tell me a neater way to do that,
I'll be happy to change :-)


> @@ -687,7 +690,7 @@ BaseBackup()
>                 * once since it can be relocated, and it will be checked before we do
>                 * anything anyway.
>                 */
> -               if (basedir != NULL && i > 0)
> +               if (basedir != NULL && !PQgetisnull(res, i, 1))
>                        verify_dir_is_empty_or_create(PQgetvalue(res, i, 1));
>        }
>
> - Should the 'i > 0' conditional above still be there..?

No. That's a cheat-check that assumes the base directory is always
sent first. Which is not true anymore - with this patch we always send
it *last* so we can include the WAL in it.


> So, that's my review from just reading the source code and the thread..
> Hope it's useful, sorry it's not more. :/

Thanks - it certainly is!

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


pgsql-hackers by date:

Previous
From: Magnus Hagander
Date:
Subject: Re: pg_basebackup for streaming base backups
Next
From: Magnus Hagander
Date:
Subject: Re: pg_basebackup for streaming base backups