Re: Best strategy to perform individual incremental backups - Mailing list pgsql-general

From Stephen Frost
Subject Re: Best strategy to perform individual incremental backups
Date
Msg-id ZNolE04gkx5F1PEg@tamriel.snowman.net
Whole thread Raw
In response to Best strategy to perform individual incremental backups  ("Mario Diangelo" <watisditnou@msn.com>)
List pgsql-general
Greetings,

* Mario Diangelo (watisditnou@msn.com) wrote:
> We use different Windows servers with Postgresql and looking for the best strategy to create daily incremental
backupsto a (managed) volume. We want to achieve this with scripts and run them daily within the TaskSchedular. Point
intime recovery seems to work only for the whole cluster but we want to be able to restore just one database on a
clusterwitch contains more databases. 

Restoring of file-based backups of any kind (PITR or just regularly
restoring a backup) requires restoring the entire cluster.  It's not
possible to restore an individual database using that method.  If you
want to do per-database restores, you'd need to use pg_dump to export
the data and then pg_restore to restore it (note that this takes a lot
more time, generally, since indexes have to be rebuilt, constraints have
to be re-validated, etc).

> It seems to start with WAL files and a full backup is necessary. We don’t want to shutdown a production server so a
fullbackup with pg_basebackup seems to be the way (i could be wrong). We want to minimize the chance of dataloss so
archivemodewill be set to on (for production environment). 

pg_basebackup is a decent tool on Windows for doing cluster backups.  It
does not support per-database backup or restore since that isn't
something that PostgreSQL really supports.  If you want to be able to
independently backup/restore things in a given cluster, the best answer
really is to put those into a different cluster instead, which also
means you'll get PITR too, which is certainly a good thing to have.

> Postgresql.conf
> wal_level = replica
> archive_mode = on
> archive_command = 'copy "%p" "D:\\backupfolder\\walfiles\\%f"' # Windows

This is actually a really bad approach when it comes to an archive
command because 'copy' wasn't written with the considerations of
PostgreSQL in mind.  While the documentation does say that the above is
an example and not a recommendation, it really doesn't go into the
details of why that is and it starts with copy not (afaik) sync'ing the
copied file to actual disk before returning success, meaning that a
crash will leave you with missing WAL files and no way to perform PITR
through those.  You could possibly get around that with xcopy /j, but
there's other problems- PG will, for various reasons, archive the same
file multiple times, which you could maybe get around with xcopy /j /w,
but the /j doesn't sync the metadata afaik, which might be an issue, and
it definitely doesn't check to see if the file being copied is identical
to the one already in the archive (which generally shouldn't happen
anyway ... unless you've misconfigured something).

There's also no validation that the WAL file being copied is going into
the correct repository location if you've got many PostgreSQL clusters
and that can lead to corruption in the repository, among a host of other
just really good checks to have to make sure backups and archiving are
all working properly.

> Does anyone have a (powershell) script that we can use to create the incremental backups on Windows and can point us
theright settings to achieve this? Do you keep the first full backup, do you replace them once a week/month and then
deletethe wall files? I’m wondering how other PostgreSQL on Windows admins perform this. Is a third party tool the only
wayto perfom such a backup? For the record, at the time being a daily full backup is executed for all individual
databases. 

pg_basebackup and pg_receivewal can be used to do backup on Windows and
to do WAL archiving in a better way than using the archive_command and
copy, though that doesn't give you an incremental backup option
currently (perhaps one day in the future it will... though pg_basebackup
is still a very long way from a proper backup solution for PG).  Perhaps
with PITR, you won't really need incremental backups though?  Depends a
lot on how active your cluster is.  The bigger and more active it is,
frankly, the more you might want to simply consider moving it off of
Windows.

Thanks,

Stephen

Attachment

pgsql-general by date:

Previous
From: Andreas Kretschmer
Date:
Subject: Re: Fatal Error : Invalid Memory alloc request size 1236252631
Next
From: David Gilman
Date:
Subject: Query plan regression between CTE and views