Thread: PITR Base Backup on an idle 8.1 server

PITR Base Backup on an idle 8.1 server

From
Greg Smith
Date:
I'm trying to figure out the best way to cope with creating a PITR base
backup on a 8.1 server that is essentially idle during that time (and for
hours afterwards).  Because there's no activity when the backup is going
on, I get the same segment file for FIRST WAL and LAST WAL.
Unfortunately, that segment doesn't get archived within an acceptable
period of time.  The "correct" approach of having a script that waits for
the LAST WAL to appear waits for far longer than the things dependant on
the backup can afford to linger around.

Obviously this problem goes away with the archive_timeout in 8.2, but I'm
curious what clever workarounds people might be using with existing 8.1
servers in this sort of situation.  Things I'm considering so far:

-Touch the expected archive file to block the archive_command, create the
file using the partial segment, finish the backup, then wipe the partial
so regular archiving will get to it eventually.  I dislike programming
around the race conditions implicit here, particularly in something that's
otherwise happily implemented in a simple shell script.

-Find something harmless I can execute in a loop that will generate WAL
activity, run that until the segment gets archived.  Haven't really
thought of something good to use for that purpose yet.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: PITR Base Backup on an idle 8.1 server

From
Marco Colombo
Date:
Greg Smith wrote:
[...]
> -Find something harmless I can execute in a loop that will generate WAL
> activity, run that until the segment gets archived.  Haven't really
> thought of something good to use for that purpose yet.

Some time ago I started a thread about taking on-the-fly backups at file
level on idle servers. Problem was much the same of yours. After posting
that, I'm doing some research on my own (in spare time) now.

Currently, I'm using the following procedure:

1) create a backup lockfile
2) issue pg_start_backup()
3) tar the data directory, excluding pg_xlog
4) issue pg_stop_backup()
5) tar pg_xlog
6) remove the lockfile

Meanwhile, a "fake" WAL archiving is active, which does pretty nothing.

archive_command = 'test ! -f /var/lib/pgsql/backup_lock </dev/null'

Under normal condition (no backup running) this will trick PG into
thinking that segments get archived. If I'm not mistaken, PG should
behave exactly as if no archive_command is configured, and recycle them
ASAP. This saves me the burden of taking care of the archiving at all.

Should a WAL segment fill up during the backup (unlikely as it is, since
the system is mostly idle AND the tar completes withing a minute - but
it's still possible), the test command would report failure in archiving
the segment, and PG would keep it around in pg_xlog, ready to be tar'ed
  at step 5 (mind you - this is speculation since I had no time to
actually test it).

So it ends up with two tar archives: one is the datafiles backup, the
other the wal segments. As an optimization, I should exclude WAL
segments older that the lockfile in step 5), since I know they are older
than the backup.

What I really should do now is kill -STOP the tar at step 3), start some
big write activity and see what exaclty happens to the WAL segment when
it fills up and PG tries to archive it.

Restore would be done the usual way, extracting both the archives, maybe
adding WAL segments from the crashed pg_xlog. Whether I need to
configure a fake restore command I have still to find out.

Hope it helps,
.TM.

Re: PITR Base Backup on an idle 8.1 server

From
Greg Smith
Date:
On Thu, 31 May 2007, Marco Colombo wrote:

> archive_command = 'test ! -f /var/lib/pgsql/backup_lock </dev/null'
> Under normal condition (no backup running) this will trick PG into thinking
> that segments get archived. If I'm not mistaken, PG should behave exactly as
> if no archive_command is configured, and recycle them ASAP.

That's correct.  I don't think you even need the </dev/null in that
command.

> Should a WAL segment fill up during the backup (unlikely as it is, since the
> system is mostly idle AND the tar completes withing a minute - but it's still
> possible), the test command would report failure in archiving the segment,
> and PG would keep it around in pg_xlog, ready to be tar'ed  at step 5 (mind
> you - this is speculation since I had no time to actually test it).

That's also correct.  What you're doing will work for getting a useful
backup.

However, recognize the limitations of the approach:  this is a clever way
to make a file-system level snapshot of your database without involving
the archive logging process.  You'll get a good backup at that point, but
it won't provide you with any ability to do roll-forward recovery if the
database gets screwed up in the middle of the day.  Since that's a
requirement of most PITR setups, I'm not sure your workaround accomplishes
what you really want.  More on why that is below.

> Restore would be done the usual way, extracting both the archives, maybe
> adding WAL segments from the crashed pg_xlog. Whether I need to configure a
> fake restore command I have still to find out.

This won't work, and resolving it will require going to grips with the
full archive logging mechanism rather than working around it the way you
suggest above.

Every time the server hits a checkpoint, it recycles old WAL
segments--renames them and then overwrites them with new data.  The first
time your database hits a checkpoint after your backup is done, you will
have lost segment files such that it's impossible to recover the current
state of the database anymore.  You'll have the first part of the series
(from the base backup), the last ones (from the current pg_xlog), but will
be missing some number in the middle (the recycled files).

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: PITR Base Backup on an idle 8.1 server

From
"Simon Riggs"
Date:
On Wed, 2007-05-30 at 22:41 -0400, Greg Smith wrote:

> -Find something harmless I can execute in a loop that will generate WAL
> activity, run that until the segment gets archived.  Haven't really
> thought of something good to use for that purpose yet.

create table xlog_switch as
select '0123456789ABCDE' from generate_series(1,1000000);
drop table xlog_switch;

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com



Re: PITR Base Backup on an idle 8.1 server

From
Marco Colombo
Date:
Greg Smith wrote:
> On Thu, 31 May 2007, Marco Colombo wrote:
>
>> archive_command = 'test ! -f /var/lib/pgsql/backup_lock </dev/null'
>> Under normal condition (no backup running) this will trick PG into
>> thinking that segments get archived. If I'm not mistaken, PG should
>> behave exactly as if no archive_command is configured, and recycle
>> them ASAP.
>
> That's correct.  I don't think you even need the </dev/null in that
> command.

Ok, thanks. I've seen that </dev/null somewhere in the docs, and blindly
copied it.

>> Should a WAL segment fill up during the backup (unlikely as it is,
>> since the system is mostly idle AND the tar completes withing a minute
>> - but it's still possible), the test command would report failure in
>> archiving the segment, and PG would keep it around in pg_xlog, ready
>> to be tar'ed  at step 5 (mind you - this is speculation since I had no
>> time to actually test it).
>
> That's also correct.  What you're doing will work for getting a useful
> backup.

Great, that's all I need.

> However, recognize the limitations of the approach:  this is a clever
> way to make a file-system level snapshot of your database without
> involving the archive logging process.  You'll get a good backup at that
> point, but it won't provide you with any ability to do roll-forward
> recovery if the database gets screwed up in the middle of the day.
> Since that's a requirement of most PITR setups, I'm not sure your
> workaround accomplishes what you really want.  More on why that is below.

Here's the original thread I started.

http://archives.postgresql.org/pgsql-general/2007-05/msg00673.php

Briefly, I don't need PITR proper, it may be even harmful in my case.
The data on the db may be tied to the data on the filesystem in ways
unknown to me... think of some kind of custom CMS. I'm able to restore
.html, .php, .png or whatever files as they were at backup time (say,
2:00AM). All I need to do with PG backups is restoring db contents at
the same time (almost). The only point in time I'm interested in is
backup time, so to say.

>> Restore would be done the usual way, extracting both the archives,
>> maybe adding WAL segments from the crashed pg_xlog. Whether I need to
>> configure a fake restore command I have still to find out.
>
> This won't work, and resolving it will require going to grips with the
> full archive logging mechanism rather than working around it the way you
> suggest above.

This is interesting. Why won't it work exactly? Let's say I trick PG in
thinking it's a recover from backup+archived wal. It'll find all
segments it needs (and no more) already in pg_xlog. I expect it to just
use them. Maybe I'd need to configure /bin/false as restore_command. Or
maybe just something like 'test -f /var/lib/pgsql/data/pg_xlog/%f' (true
if the file is already there). I'll have to experiment, but I don't see
any major problem right now. The files are already there.

> Every time the server hits a checkpoint, it recycles old WAL
> segments--renames them and then overwrites them with new data.  The
> first time your database hits a checkpoint after your backup is done,
> you will have lost segment files such that it's impossible to recover
> the current state of the database anymore.  You'll have the first part
> of the series (from the base backup), the last ones (from the current
> pg_xlog), but will be missing some number in the middle (the recycled
> files).

Sure, now I see what you mean, but I was under the assumption of very
low database activity, in may case, it'a about 2 wal segments/day. I
usually see files in my pg_xlog that are 2 days old, so there won't be
any missing segments. And anyway, the ability to recover at some time
after the backup is just a plus. I don't need it. In case of a complete
crash, I'm going to restore the whole system as it was at backup time.
And if only the PG datadir gets corrupted later, and I want to try and
recover it as it was at that later time, still I have a 99% chance of
being able to do so, due to very low write activity. And if that fails,
because of some uncommon write activity right at that inconvenient time,
I can just fall back to the case of a complete system crash. The chances
of that happing are possibly lower of those of a system crash, so I'm
not worried about it.

I think that all we want is a backup that is immediately usable, w/o
waiting for the WAL segment it relies on to be archived. That is, if
taken at 2:00AM, it may be used to recover a crash at 2:10AM (assuming
the backup process ended by that time, of course).

If you need *both* a "full backup" *and* PITR, just add a real cp to the
archive_command above. The important part is to return failure during
the backup process, I think.

.TM.

Re: PITR Base Backup on an idle 8.1 server

From
Greg Smith
Date:
On Fri, 1 Jun 2007, Marco Colombo wrote:

> If you need *both* a "full backup" *and* PITR, just add a real cp to the
> archive_command above. The important part is to return failure during the
> backup process, I think.

You seem to have worked out a way for your application to do a base backup
in a fashion that you're happy with.  The way you're grabbing files
directly from the xlog directory only works because your commit workload
is so trivial that you can get away with it, and because you haven't then
tried to apply future archive logs.  In the general case, circumventing
the archiving when the backup is going on won't guarantee everything is
ordered just right for PITR to work correctly.  I consider what you're
doing a bad idea that you happen to be comfortable with the ramifications
of, and given the circumstances I understand how you have ended up with
that solution.

I would highly recommend you consider switching at some point to the
solution Simon threw out:

> create table xlog_switch as
> select '0123456789ABCDE' from generate_series(1,1000000);
> drop table xlog_switch;

as the best way to solve this problem (thanks, Simon; that's what I was
looking for but didn't quite have figured out yet).  With that as a
working way to force a segment change on an idle server without actually
impacting the data, you should reconsider doing your PITR backup
properly--where you never touch anything in the xlog directory and instead
only work with what the archive_command is told.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: PITR Base Backup on an idle 8.1 server

From
Marco Colombo
Date:
Greg Smith wrote:
> The way you're grabbing
> files directly from the xlog directory only works because your commit
> workload is so trivial that you can get away with it, and because you
> haven't then tried to apply future archive logs.

Well, it's only because I don't need future logs, just like I don't need
"future" files. Backup is at 2:00 AM, any change after that is
potentially lost. That includes e-mails, web contents, and database
contents. The database contents are in no way different to us.

It's the "your commit workload is so trivial that you can get away with
it" I don't really get, but more on this later.

> In the general case,
> circumventing the archiving when the backup is going on won't guarantee
> everything is ordered just right for PITR to work correctly.

Generic PITR? You mean if backup is at 2:00 AM and the server crashes
(all disks lost) at 2:00 PM, you want to be able to recover to some
time like 11:00 AM, and be precise about it? That's PITR to me - and the
"precise" part is key here... either the time or the transaction ID
would do, the point is being able to draw a line and say "anything
before this is correct".

Well if that's what you mean by PITR, I never claimed my method would
give you that ability. I'm pretty aware it won't do, in the general
case. If you need that, you need to archive all the logs created after
the backup, that's pretty obvious.

But even under heavy write load, my method works, if the only point in
time you want to be able to recover is 2:00AM.

It works for you too, it gives you nice working backup. If you also need
real PITR, your archive_commmand is going to be something like:

archive_command = 'test ! -f /var/lib/pgsql/backup_lock && cp %p
/my_archive_dir/%f'

> I consider
> what you're doing a bad idea that you happen to be comfortable with the
> ramifications of, and given the circumstances I understand how you have
> ended up with that solution.
>
> I would highly recommend you consider switching at some point to the
> solution Simon threw out:
>
>> create table xlog_switch as
>> select '0123456789ABCDE' from generate_series(1,1000000);
>> drop table xlog_switch;

Ok, now the segment gets rotated, and a copy of the file appears
somewhere. What's the difference in having the archive_command store it
or your backup procedure store it?

Let's say my archive_command it's a cp to another directory, and let's
say step 5) is a cp too. What exaclty buys me to force a segment switch
with dummy data instead of doing a cp myself on the real segment data?

I mean, both ways would do.

> you should reconsider doing your PITR backup
> properly--where you never touch anything in the xlog directory and
> instead only work with what the archive_command is told.

Well, I'm copying files. That's exaclty what a typical archive_command
does. It's no special in any way, just a cp (or tar or rsync or
whatever). Unless you mean I'm not supposed to copy a partially filled
segment. There can be only one, the others would be full ones, and full
ones are no problem. I think PG correctly handles the partial one if I
drop it in pg_xlog at recover time.

That segment you need to treat specially at recover time, if you use my
procedure (in my case, I don't). If you have a later copy if it (most
likely an archived one), you have to make it avalable to PG instead of
the old one, if you want to make use of the rest of the archived
segments. If you don't want to care about this, then I agree your method
of forcing a segment switch is simpler. There's not partial segment at
all. Anyway, it's running a "psql -c" at backup time vs. a "test -nt &&
rm" at restore time, not a big deal in either case.

.TM.

Re: PITR Base Backup on an idle 8.1 server

From
"Simon Riggs"
Date:
On Mon, 2007-06-04 at 12:55 +0200, Marco Colombo wrote:
> Greg Smith wrote:
> > The way you're grabbing
> > files directly from the xlog directory only works because your commit
> > workload is so trivial that you can get away with it, and because you
> > haven't then tried to apply future archive logs.
>
> Well, it's only because I don't need future logs, just like I don't need
> "future" files. Backup is at 2:00 AM, any change after that is
> potentially lost. That includes e-mails, web contents, and database
> contents. The database contents are in no way different to us.
>
> It's the "your commit workload is so trivial that you can get away with
> it" I don't really get, but more on this later.
>
> > In the general case,
> > circumventing the archiving when the backup is going on won't guarantee
> > everything is ordered just right for PITR to work correctly.
>
> Generic PITR? You mean if backup is at 2:00 AM and the server crashes
> (all disks lost) at 2:00 PM, you want to be able to recover to some
> time like 11:00 AM, and be precise about it? That's PITR to me - and the
> "precise" part is key here... either the time or the transaction ID
> would do, the point is being able to draw a line and say "anything
> before this is correct".

> my method

...is dangerous and anyone reading this thread would be well advised to
read the manual in full rather than treating this as a new and clever
technique. I'm adding this as a footnote so that the archives are clear
on this point, so we don't get loads of new DBAs picking up this idea
but missing the exact point of danger.

Making the assumption that its OK to archive WAL files in the pg_xlog
directory exposes you to the risk of having them deleted by the
archiver, which will invalidate your backup. That might not happen all
of the time, but I'm willing to bet that the time you need it is the
time it didn't work for you. Even if this doesn't effect you, it might
effect others, so I want to be certain to stamp this out before the fire
spreads.

You can still do the lock file test using a safe method. I'll document
that idea so we can steer people in the right direction.

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com



Re: PITR Base Backup on an idle 8.1 server

From
Marco Colombo
Date:
Simon Riggs wrote:
 > Marco Colombo wrote:
>> my method
>
> ...is dangerous

Ok, but why? Once again, I'm asking: what _exactly_ can go wrong?

 > so we don't get loads of new DBAs picking up this idea
but missing the exact point of danger.

I'm one of them. I'm _am_ missing the exact point of danger.

 > Making the assumption that its OK to archive WAL files in the pg_xlog
                                     ^^^^^^^^^^
 > directory exposes you to the risk of having them deleted by the
 > archiver, which will invalidate your backup.
   ^^^^^^^^

I'm sorry I'm really having a hard time following you here... what is
"to archive" and "the archiver"? The archive_command? The tar in the
backup procedure? What do you mean by "deleted"? AFAIK, files in pg_xlog
are first renamed (and only if and after the archive_command returned
true) and later overwritten to. Never deleted. Anyway, how could that
invalidate the backup? It's all about making a self-contained backup.
What happens after that, it's irrelevant.

Hey, I haven't come here proposing a new revolutionary way to perform
backups! I've made pretty clear it was for a not-so-common case. And
anyway, I've just asked what may be wrong with my procedure, since it
seems to fit _my_ needs and it makes _my_ life simpler, and _I_ don't
see any flaw in it. It may be useful to others, _if_ it's correct. If
not, I'd like to know why.

Can you provide a simple failure scenario, please? That would help me
understand what I'm missing...

.TM.


Re: PITR Base Backup on an idle 8.1 server

From
"Simon Riggs"
Date:
On Tue, 2007-06-05 at 18:39 +0200, Marco Colombo wrote:
> I'm asking: what _exactly_ can go wrong?

If a checkpoint occurs while taking the backup then the contents of the
files will be overwritten and you will be unable to rollforward from
before the backup until after the backup. This will give you the FATAL
error message "WAL ends before end time of backup dump". You won't know
this until you have attempted recovery using those files, even if the
scripts give rc=0.

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com



Re: PITR Base Backup on an idle 8.1 server

From
Greg Smith
Date:
On Tue, 5 Jun 2007, Marco Colombo wrote:

> AFAIK, files in pg_xlog are first renamed (and only if and after the
> archive_command returned true) and later overwritten to. Never deleted.

No, they get deleted sometimes, too.  Not often, but it can happen under
heavy load if more segments get temporarily created than are normally
needed.  At checkpoint time, only 2*checkpoint_segments+1 xlog files are
kept; if there are more than that, they are removed.  Probably never
happen on your system from what you've described of it, but it is a
possibility.

As Simon just pointed out, the danger with the approach you're taken comes
from what happens if a checkpoint occurs in the middle of your backup.
You've probably never seen that happen either.  As long as that continues
to be true, you might be OK for now, but you really need to get to where
you're following the recommended procedure rather than trying to do
something a little different.  There are too many edge cases here that
could^H^H^H^H^Hwill bite you one day.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: PITR Base Backup on an idle 8.1 server

From
Marco Colombo
Date:
Greg Smith wrote:
> On Tue, 5 Jun 2007, Marco Colombo wrote:
>
>> AFAIK, files in pg_xlog are first renamed (and only if and after the
>> archive_command returned true) and later overwritten to. Never deleted.
>
> No, they get deleted sometimes, too.  Not often, but it can happen under
> heavy load if more segments get temporarily created than are normally
> needed.  At checkpoint time, only 2*checkpoint_segments+1 xlog files are
> kept; if there are more than that, they are removed.  Probably never
> happen on your system from what you've described of it, but it is a
> possibility.

Ok, you're right. Anyway neither renames nor deletes can happen during
my backups. My archive_command prevents it.

> As Simon just pointed out, the danger with the approach you're taken
> comes from what happens if a checkpoint occurs in the middle of your
> backup. You've probably never seen that happen either.  As long as that
> continues to be true, you might be OK for now, but you really need to
> get to where you're following the recommended procedure rather than
> trying to do something a little different.  There are too many edge
> cases here that could^H^H^H^H^Hwill bite you one day.

Let's say you use the standard procedure. Let's say that your
archive_commands starts failing at 1:00 AM (say, no space left on the
archive directory). Let's say your backup starts at 2:00 AM.

Later, at 8:00 AM you solve the disk full problem, and PG resumes the
archiving of WAL segments... as long as there's enough room in pg_xlog
for the extra segments, PG is fine.

Is your backup broken if a checkpoint happens during the backup? In my
understanding, no. There's no need for archiving to happen immediately.

My procedure just simulates an archiving failure during the backup, no
more, no less. The only difference is that the "problem" is solved right
after the backup. By that time, all WAL records created during the
backup have already been saved.

If you're going to archive WAL segments anyway, my procedure buys you
almost nothing. It just saves you from using a trick and artificially
fill a segment with garbage data just to have it archived right after
the backup. It does so by using another trick, so no big deal. But it
allows anyone not willing to take the burden, or face the danger, of
actually archiving logs, to take file level backups instead of pg_dumps
and without relying on external snapshotting abilities.

.TM.

Re: PITR Base Backup on an idle 8.1 server

From
Marco Colombo
Date:
Simon Riggs wrote:
> On Tue, 2007-06-05 at 18:39 +0200, Marco Colombo wrote:
>> I'm asking: what _exactly_ can go wrong?
>
> If a checkpoint occurs while taking the backup then the contents of the
> files will be overwritten
   ^^^^^
Data files or WAL segments? My archive command prevents WAL segments
from being recycled during the backup.

  and you will be unable to rollforward from
> before the backup until after the backup. This will give you the FATAL
> error message "WAL ends before end time of backup dump". You won't know
> this until you have attempted recovery using those files, even if the
> scripts give rc=0.

Well, my procedure currently produces two tar achives. One is the 'base
backup' (a copy of the datafiles). One is a copy of wal segments, right
after the backup. Which one do you expect to be corrupted if a
checkpoint happens during the backup?

.TM.