Thread: Are file system level differential/incremental backups possible?

Are file system level differential/incremental backups possible?

From
Bob Hatfield
Date:
Is it possible to do a full file system level backup of the data
directory, say once a week, and differentials or incrementals daily?

I'm wondering if there are files that would normally be removed that a
restore: Full then diff/inc would not remove and perhaps
corrupt/confuse things.

Process:
Saturday: Full backup (reset archive bits) of data dir with database shutdown
Sunday: Differential (don't reset archive bits) of data dir with
database shutdown
Monday: Differential (don't reset archive bits) of data dir with
database shutdown
Wednesday: Restore to test server using Saturday's Full and Monday's
Differential.

Obviously this works for regular files/file systems; however, I'm not
sure this is a good method with postgresql as the resulting data dir
*may* (?) contain extra files (or other issues)?

Note: our database is 850GB (Windows 2008 R2 pg version 8.3.12)

Re: Are file system level differential/incremental backups possible?

From
Julien Rouhaud
Date:
As there's one file for each object, a single update on each would make you to copy the all the file again. I heard there was tool to make differentiel copy of a part of a file but I don't know if it's really efficient.

Anyway, a better way for you would be to do a regular backup (with pg_start_backup, copy and pg_stop_backup) and then use wal archive_command to keep the xlogs between 2 full backups.

On Wed, Oct 12, 2011 at 11:30 PM, Bob Hatfield <bobhatfield@gmail.com> wrote:
Is it possible to do a full file system level backup of the data
directory, say once a week, and differentials or incrementals daily?

I'm wondering if there are files that would normally be removed that a
restore: Full then diff/inc would not remove and perhaps
corrupt/confuse things.

Process:
Saturday: Full backup (reset archive bits) of data dir with database shutdown
Sunday: Differential (don't reset archive bits) of data dir with
database shutdown
Monday: Differential (don't reset archive bits) of data dir with
database shutdown
Wednesday: Restore to test server using Saturday's Full and Monday's
Differential.

Obviously this works for regular files/file systems; however, I'm not
sure this is a good method with postgresql as the resulting data dir
*may* (?) contain extra files (or other issues)?

Note: our database is 850GB (Windows 2008 R2 pg version 8.3.12)

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: Are file system level differential/incremental backups possible?

From
Bob Hatfield
Date:
> Anyway, a better way for you would be to do a regular backup (with pg_start_backup, copy and pg_stop_backup) and then
usewal archive_command to keep the xlogs between 2 full backups. 

Thanks Julien.  Can pg_start/stop_backup() be used for regular full
file system backups?   All of the documentation I've read only refers
to using those for warm standby/wal shipping methods.

Re: Are file system level differential/incremental backups possible?

From
Julien Rouhaud
Date:
On Thu, Oct 13, 2011 at 12:04 AM, Bob Hatfield <bobhatfield@gmail.com> wrote:
> Anyway, a better way for you would be to do a regular backup (with pg_start_backup, copy and pg_stop_backup) and then use wal archive_command to keep the xlogs between 2 full backups.

Thanks Julien.  Can pg_start/stop_backup() be used for regular full
file system backups?   All of the documentation I've read only refers
to using those for warm standby/wal shipping methods.

Yes, and it's the only way to do a file backup without stopping the server. Careful, the command can last a while as it forces a checkpoint (see the doc for more details).

It's used for warm standby to create a copy of the server, before the wals that'll be generated can continue to restore it.

Re: Are file system level differential/incremental backups possible?

From
Alan Hodgson
Date:
On October 12, 2011 03:04:30 PM Bob Hatfield wrote:
> > Anyway, a better way for you would be to do a regular backup (with
> > pg_start_backup, copy and pg_stop_backup) and then use wal
> > archive_command to keep the xlogs between 2 full backups.
>
> Thanks Julien.  Can pg_start/stop_backup() be used for regular full
> file system backups?   All of the documentation I've read only refers
> to using those for warm standby/wal shipping methods.

The base backup necessary to initialize a warm standby server is a full file
system backup of the database, which can also be used for restores to any
point in time after the base backup is completed, assuming you also have all
the archived WAL files.

Re: Are file system level differential/incremental backups possible?

From
Bob Hatfield
Date:
> The base backup necessary to initialize a warm standby server is a full file system backup of the database, which can
alsobe used for restores to any point in time after the base backup is completed, assuming you also have all the
archivedWAL files. 

Thanks to both of  you.  I currently use the base backup technique for
use with a warm standby server but was not aware I could use that for
full file system level backups as well.

In fact, I currently run a warm standby server (created from a base
backup every several weeks) as well as stop/start the pg process to do
a full file system level copy each night.  If I think about this a
bit, I'm sure there's a more efficient way of doing this.  (As well as
may solve another problem I'm having (another post) with the standby
server's database getting corrupt after stopping/starting the
primary's pg process.)

Re: Are file system level differential/incremental backups possible?

From
"Albe Laurenz"
Date:
Bob Hatfield wrote:
> Is it possible to do a full file system level backup of the data
> directory, say once a week, and differentials or incrementals daily?
>
> I'm wondering if there are files that would normally be removed that a
> restore: Full then diff/inc would not remove and perhaps
> corrupt/confuse things.

If you drop or truncate a table between the full and the incremental
backup, will that file be "resurrected"?

Such resurrected files will not disturb PostgreSQL, but if you keep
them around, you might end up with a lot of dead files if you have to
restore a couple of times.

Yours,
Laurenz Albe

Re: Are file system level differential/incremental backups possible?

From
Bob Hatfield
Date:
> If you drop or truncate a table between the full and the incremental backup, will that file be "resurrected"?
>
> Such resurrected files will not disturb PostgreSQL, but if you keep them around, you might end up with a lot of dead
filesif you have to restore a couple of times. 

That makes sense, thank you.  At one point, I was doing delta copies
of the database: full one night, then only new/changed files the
following nights -- I am pretty sure that didn't work and thought it
was because the destination had extra files that pg had removed on the
primary.  I now use a robocopy /mir which removes files on the
destination that aren't on the primary.  This seems to work fine --
but it made me think that PostreSQL doesn't work with the former
process.

Re: Are file system level differential/incremental backups possible?

From
Craig Ringer
Date:
On 10/13/2011 05:30 AM, Bob Hatfield wrote:
> Is it possible to do a full file system level backup of the data
> directory, say once a week, and differentials or incrementals daily?

I'd love to be able to do this, but you can't do it usefully at a
file-system level. There's too much churn in the data files for even a
binary diff to be much use - and even if it were, the performance of it
would be miserable.

You *could* do a differential so long as it's a proper one that tracks
file removals as well as additions/changes. It'd be pretty pointless
though as you wouldn't save much if any storage.

I've looked into database-level diffs, but from what I can find out it
seems that PostgreSQL's MVCC system doesn't store enough information to
produce a differential dump, either. There's no way to detect and record
tuples that were deleted then vacuumed away since the last backup was
taken, so a trigger-based or WAL-based system is necessary.

I'd love a way to "collapse" or merge a set of WAL segments into a
minimal diff that only contained just enough information to get from the
start to end state of the series of WAL segments, rather than all the
churn in-between. This would be great for storing longer PITR histories
(but more coarsely) and being able to do faster restores. Unfortunately
given how scattered writes are I doubt it'd actually be possible or any
faster if it was.

SQL-level differentials would be great, though.

--
Craig Ringer

Re: Are file system level differential/incremental backups possible?

From
Bob Hatfield
Date:
>> Is it possible to do a full file system level backup of the data
>> directory, say once a week, and differentials or incrementals daily?
>
> I'd love to be able to do this, but you can't do it usefully at a
> file-system level. There's too much churn in the data files for even a
> binary diff to be much use - and even if it were, the performance of it
> would be miserable.

Actually, for us, a full backup is currently about 950GB which takes
about 24 hours and a diff several days later is only 150GB and takes
two hours.  (It takes significantly less time not only becuase it's
less data but also because the diff job doesn't have to compete with
other backup jobs (since it's faster due to less data).)

I currently stop pg, robocopy (rsync) the changes to our DRS server,
restart pg -- then backup the data on the DRS server later.  This
allows us to have our database back up within 30 minutes.

I've tested the database after doing a restore: full + diff and it
seems fine (both viewing with the application as well as running a
reindex on the entire db).  I think we'll stick with this process with
the understanding that if a disaster struck, and we needed to do a
full + diff restore, that we may have a few files that are there that
that aren't needed.

Re: Are file system level differential/incremental backups possible?

From
Craig Ringer
Date:
On 10/20/2011 02:00 AM, Bob Hatfield wrote:
>>> Is it possible to do a full file system level backup of the data
>>> directory, say once a week, and differentials or incrementals daily?
>>
>> I'd love to be able to do this, but you can't do it usefully at a
>> file-system level. There's too much churn in the data files for even a
>> binary diff to be much use - and even if it were, the performance of it
>> would be miserable.
>
> Actually, for us, a full backup is currently about 950GB which takes
> about 24 hours and a diff several days later is only 150GB and takes
> two hours.  (It takes significantly less time not only becuase it's
> less data but also because the diff job doesn't have to compete with
> other backup jobs (since it's faster due to less data).)

Wow. I'm very surprised by that, but it's good to hear. I would've
expected a much poorer result, and have never seen anything like that
good a result when I've tested with my (much smaller) data sets.

Perhaps a binary diff based backup mechanism that's aware of Pg's
structure and can skip holes, etc is worth looking into after all.

--
Craig Ringer

Re: Are file system level differential/incremental backups possible?

From
"Greg Sabino Mullane"
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160


>> Is it possible to do a full file system level backup of the data
>> directory, say once a week, and differentials or incrementals daily?
...
> taken, so a trigger-based or WAL-based system is necessary.
...
> SQL-level differentials would be great, though.

FWIW, Bucardo 5 has a 'flatfile' mode that will output the deletion
and insert statements to get a table from an old state to a new one.
Not quite what is being asked for here, but could be useful for
similar things as well as for auditing.

- --
Greg Sabino Mullane greg@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201110221339
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAk6i/9oACgkQvJuQZxSWSsjbkgCfRIIstVLsw6OeMUJZ2iL2if4Z
4hQAoIpOZXK/OseM45FAUmi8vcESpRif
=ed0H
-----END PGP SIGNATURE-----