Re: Include WAL in base backup - Mailing list pgsql-hackers

From Magnus Hagander
Subject Re: Include WAL in base backup
Date
Msg-id AANLkTinXNP5fjbE+PUFoVfn_iM+jN7DxaE-hKxHDEbpJ@mail.gmail.com
Whole thread Raw
In response to Re: Include WAL in base backup  (Fujii Masao <masao.fujii@gmail.com>)
Responses Re: Include WAL in base backup  (Fujii Masao <masao.fujii@gmail.com>)
List pgsql-hackers
On Tue, Jan 25, 2011 at 10:56, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Tue, Jan 25, 2011 at 12:33 AM, Magnus Hagander <magnus@hagander.net> wrote:
>> Here's an updated version of the patch:
>>
>> * rebased on the current git master (including changing the switch
>> from -w to -x since -w was used now)
>> * includes some documentation
>> * use XLogByteToSeg and uint32 as mentioned here
>> * refactored to remove SendBackupDirectory(), removing the ugly closetar boolean
>
> I reviewed the patch. Here are comments.
>
>
> +               {"xlog", no_argument, NULL, 'w'},
>
> Typo: s/w/x

Ah, oops. thanks.

> /*
>  * BASE_BACKUP [LABEL <label>] [PROGRESS] [FAST]
>  */
>
> In repl_gram.y, the above comment needs to be updated.

Same here - oops, thanks. It was also missing the quotes around <label>.


> Both this patch and the multiple-backup one removes SendBackupDirectory
> in the almost same way. So, how about committing that common part first?

I think they're small enough that we'll just commit it along with
whichever comes first and then have the other one merge with it.

> +               XLByteToSeg(endptr, endlogid, endlogseg);
>
> XLByteToPrevSeg should be used for the backup end location. Because
> when it's a boundary byte, all the required WAL records exist in the
> previous WAL segment. This is why pg_stop_backup uses XLByteToPrevSeg
> for the backup end location.

Well, it's just for debugging output, really, but see below.


> +               elog(DEBUG1, "Going to send wal from %i.%i to %i.%i",
> +                        logid, logseg,
> +                        endlogid, endlogseg);
>
> %u should be used instead of %i. Or how about logging the filename?

I actually plan to remove that DEBUG output completely (sorry, forgot
to mention that), I'm not sure it's useful to have it around once we
apply. Or do you think it would be useful to keep?


> +                       char    xlogname[64];
>
> How about using MAXFNAMELEN instead of 64?
>

Agreed.


> +                       XLogFileName(xlogname, ThisTimeLineID, logid, logseg);
> +                       sprintf(fn, "./pg_xlog/%s", xlogname);
>
> How about using XLogFilePath? instead?

Can't do that, because we need the "./" part when we call sendFile() -
it's structured around having that one, since we get it from the file
name looping.


> +                       if (logid > endptr.xlogid ||
> +                               (logid == endptr.xlogid && logseg >= endptr.xrecoff / XLogSegSize))
>
> Why don't you use endlogseg?

Honestly? Because I thought I added endlogseg just for the debugging
output, didn't realize I had it down here.

But if I do, then I should not use the XLByteToPrevSeg() per your
suggestion above, right? Keep it the way it is.

> The estimated size of $PGDATA doesn't include WAL files, but the
> actual one does.
> This inconsistency messes up the progress report as follows:
>
>    33708/17323 kB (194%) 1/1 tablespaces

Yeah, that is definitely a potential problem. I think we're just going
to have to document it - and in a big database, it's not going to be
quite as bad...


> So, what about sparating the progress report into two parts: one for $PGDATA and
> tablespaces, another for WAL files?

We can't really do that. We need to send the progress report before we
begin the tar file, and we don't know how many xlog segments we will
have at that time. And we don't want to send pg_xlog as a separate tar
stream, because if we do we won't be able to get the single-tar-output
in the simple case - thus no piping etc. I actually had the xlog data
as it's own tar stream in the beginning, but Heikki convinced me of
the advantage of keeping it in the main one...

What we could do, I guess, is to code pg_basebackup to detect when it
starts receiving xlog files and then stop increasing the percentage at
that point, instead just doing a counter?

(the discussed changse above have been applied and pushed to github)

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


pgsql-hackers by date:

Previous
From: Xiaobo Gu
Date:
Subject: Re: Is there a way to build PostgreSQL client libraries with MinGW
Next
From: Xiaobo Gu
Date:
Subject: Re: Is there a way to build PostgreSQL client libraries with MinGW