Thread: Best strategy to perform individual incremental backups

Best strategy to perform individual incremental backups

From
"Mario Diangelo"
Date:
Hello you all,

We use different Windows servers with Postgresql and looking for the best strategy to create daily incremental backups to a (managed) volume. We want to achieve this with scripts and run them daily within the TaskSchedular. Point in time recovery seems to work only for the whole cluster but we want to be able to restore just one database on a cluster witch contains more databases.

It seems to start with WAL files and a full backup is necessary. We don’t want to shutdown a production server so a full backup with pg_basebackup seems to be the way (i could be wrong). We want to minimize the chance of dataloss so archivemode will be set to on (for production environment).

Postgresql.conf
wal_level = replica
archive_mode = on
archive_command = 'copy "%p" "D:\\backupfolder\\walfiles\\%f"' # Windows

Does anyone have a (powershell) script that we can use to create the incremental backups on Windows and can point us the right settings to achieve this? Do you keep the first full backup, do you replace them once a week/month and then delete the wall files? I’m wondering how other PostgreSQL on Windows admins perform this. Is a third party tool the only way to perfom such a backup? For the record, at the time being a daily full backup is executed for all individual databases. 

For Oracle (rman) we use a Powershell script that backups the database (incremental) with archive logs and delete the local archive files after successful backup and also delete files older then …..The backup server SLA keeps the backups in archive.

Thanks in advance.

Greetings,

Mario

Re: Best strategy to perform individual incremental backups

From
Stephen Frost
Date:
Greetings,

* Mario Diangelo (watisditnou@msn.com) wrote:
> We use different Windows servers with Postgresql and looking for the best strategy to create daily incremental
backupsto a (managed) volume. We want to achieve this with scripts and run them daily within the TaskSchedular. Point
intime recovery seems to work only for the whole cluster but we want to be able to restore just one database on a
clusterwitch contains more databases. 

Restoring of file-based backups of any kind (PITR or just regularly
restoring a backup) requires restoring the entire cluster.  It's not
possible to restore an individual database using that method.  If you
want to do per-database restores, you'd need to use pg_dump to export
the data and then pg_restore to restore it (note that this takes a lot
more time, generally, since indexes have to be rebuilt, constraints have
to be re-validated, etc).

> It seems to start with WAL files and a full backup is necessary. We don’t want to shutdown a production server so a
fullbackup with pg_basebackup seems to be the way (i could be wrong). We want to minimize the chance of dataloss so
archivemodewill be set to on (for production environment). 

pg_basebackup is a decent tool on Windows for doing cluster backups.  It
does not support per-database backup or restore since that isn't
something that PostgreSQL really supports.  If you want to be able to
independently backup/restore things in a given cluster, the best answer
really is to put those into a different cluster instead, which also
means you'll get PITR too, which is certainly a good thing to have.

> Postgresql.conf
> wal_level = replica
> archive_mode = on
> archive_command = 'copy "%p" "D:\\backupfolder\\walfiles\\%f"' # Windows

This is actually a really bad approach when it comes to an archive
command because 'copy' wasn't written with the considerations of
PostgreSQL in mind.  While the documentation does say that the above is
an example and not a recommendation, it really doesn't go into the
details of why that is and it starts with copy not (afaik) sync'ing the
copied file to actual disk before returning success, meaning that a
crash will leave you with missing WAL files and no way to perform PITR
through those.  You could possibly get around that with xcopy /j, but
there's other problems- PG will, for various reasons, archive the same
file multiple times, which you could maybe get around with xcopy /j /w,
but the /j doesn't sync the metadata afaik, which might be an issue, and
it definitely doesn't check to see if the file being copied is identical
to the one already in the archive (which generally shouldn't happen
anyway ... unless you've misconfigured something).

There's also no validation that the WAL file being copied is going into
the correct repository location if you've got many PostgreSQL clusters
and that can lead to corruption in the repository, among a host of other
just really good checks to have to make sure backups and archiving are
all working properly.

> Does anyone have a (powershell) script that we can use to create the incremental backups on Windows and can point us
theright settings to achieve this? Do you keep the first full backup, do you replace them once a week/month and then
deletethe wall files? I’m wondering how other PostgreSQL on Windows admins perform this. Is a third party tool the only
wayto perfom such a backup? For the record, at the time being a daily full backup is executed for all individual
databases. 

pg_basebackup and pg_receivewal can be used to do backup on Windows and
to do WAL archiving in a better way than using the archive_command and
copy, though that doesn't give you an incremental backup option
currently (perhaps one day in the future it will... though pg_basebackup
is still a very long way from a proper backup solution for PG).  Perhaps
with PITR, you won't really need incremental backups though?  Depends a
lot on how active your cluster is.  The bigger and more active it is,
frankly, the more you might want to simply consider moving it off of
Windows.

Thanks,

Stephen

Attachment