Re: PITR Base Backup on an idle 8.1 server - Mailing list pgsql-general

From Marco Colombo
Subject Re: PITR Base Backup on an idle 8.1 server
Date
Msg-id 4663EF88.5010700@esiway.net
In response to Re: PITR Base Backup on an idle 8.1 server  (Greg Smith <gsmith@gregsmith.com>)
Responses Re: PITR Base Backup on an idle 8.1 server  ("Simon Riggs" <simon@2ndquadrant.com>)
List pgsql-general
Greg Smith wrote:
> The way you're grabbing
> files directly from the xlog directory only works because your commit
> workload is so trivial that you can get away with it, and because you
> haven't then tried to apply future archive logs.

Well, it's only because I don't need future logs, just like I don't need
"future" files. Backup is at 2:00 AM, any change after that is
potentially lost. That includes e-mails, web contents, and database
contents. The database contents are in no way different to us.

It's the "your commit workload is so trivial that you can get away with
it" I don't really get, but more on this later.

> In the general case,
> circumventing the archiving when the backup is going on won't guarantee
> everything is ordered just right for PITR to work correctly.

Generic PITR? You mean if backup is at 2:00 AM and the server crashes
(all disks lost) at 2:00 PM, you want to be able to recover to some
time like 11:00 AM, and be precise about it? That's PITR to me - and the
"precise" part is key here... either the time or the transaction ID
would do, the point is being able to draw a line and say "anything
before this is correct".

Well if that's what you mean by PITR, I never claimed my method would
give you that ability. I'm pretty aware it won't do, in the general
case. If you need that, you need to archive all the logs created after
the backup, that's pretty obvious.

But even under heavy write load, my method works, if the only point in
time you want to be able to recover is 2:00AM.

It works for you too: it gives you a nice working backup. If you also need
real PITR, your archive_command is going to be something like:

archive_command = 'test ! -f /var/lib/pgsql/backup_lock && cp %p /my_archive_dir/%f'
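
To make the behaviour of that command explicit, here is a minimal sketch of the same logic as a shell function. The function name archive_segment and the LOCK and ARCHIVE_DIR variables are made up for illustration; only the two paths come from the example above. The idea is that while the lock file exists the command fails, and since the server treats a non-zero exit status as "not archived", it keeps the segment and retries later.

```shell
# Hypothetical wrapper with the same logic as the archive_command above.
# LOCK and ARCHIVE_DIR default to the paths used in the example.
LOCK="${LOCK:-/var/lib/pgsql/backup_lock}"
ARCHIVE_DIR="${ARCHIVE_DIR:-/my_archive_dir}"

# archive_segment <path to segment (%p)> <segment file name (%f)>
archive_segment() {
    # Fail while the backup lock exists, so nothing is archived during
    # the backup; otherwise copy the segment into the archive directory.
    test ! -f "$LOCK" && cp "$1" "$ARCHIVE_DIR/$2"
}
```

The function returns non-zero both when the lock is present and when the cp itself fails, which is what an archive_command is expected to do whenever the segment was not stored.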

> I consider
> what you're doing a bad idea that you happen to be comfortable with the
> ramifications of, and given the circumstances I understand how you have
> ended up with that solution.
>
> I would highly recommend you consider switching at some point to the
> solution Simon threw out:
>
>> create table xlog_switch as
>> select '0123456789ABCDE' from generate_series(1,1000000);
>> drop table xlog_switch;

Ok, now the segment gets rotated, and a copy of the file appears
somewhere. What's the difference in having the archive_command store it
or your backup procedure store it?

Let's say my archive_command is a cp to another directory, and let's
say step 5) is a cp too. What exactly does forcing a segment switch with
dummy data buy me, compared to doing a cp myself on the real segment data?

I mean, both ways would do.

> you should reconsider doing your PITR backup
> properly--where you never touch anything in the xlog directory and
> instead only work with what the archive_command is told.

Well, I'm copying files. That's exactly what a typical archive_command
does. It's not special in any way, just a cp (or tar or rsync or
whatever). Unless you mean I'm not supposed to copy a partially filled
segment. There can be only one, the others would be full ones, and full
ones are no problem. I think PG correctly handles the partial one if I
drop it in pg_xlog at recover time.

That segment you need to treat specially at recover time, if you use my
procedure (in my case, I don't). If you have a later copy of it (most
likely an archived one), you have to make it available to PG instead of
the old one, if you want to make use of the rest of the archived
segments. If you don't want to care about this, then I agree your method
of forcing a segment switch is simpler. There's no partial segment at
all. Anyway, it's running a "psql -c" at backup time vs. a "test -nt &&
rm" at restore time, not a big deal in either case.
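
The restore-time check mentioned above could look roughly like the sketch below. The function name prefer_archived_segment and its three arguments are hypothetical, chosen only for this illustration. It assumes the partial segment saved with the backup has already been dropped into pg_xlog and that a later copy of the same segment may exist in the archive directory.

```shell
# prefer_archived_segment <pg_xlog dir> <archive dir> <segment file name>
# Hypothetical helper: if the archived copy of the segment is newer than
# the partial copy restored from the backup, remove the old partial copy
# so that the archived one is used instead.
prefer_archived_segment() {
    if [ "$2/$3" -nt "$1/$3" ]; then
        rm "$1/$3"
    fi
}
```

If no archived copy exists, or it is not newer, the partial segment from the backup is left in place, which is the 2:00 AM-only recovery case.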

.TM.
