Re: [PERFORM] Backup taking long time !!! - Mailing list pgsql-performance

From Stephen Frost
Subject Re: [PERFORM] Backup taking long time !!!
Date
Msg-id 20170122163748.GG18360@tamriel.snowman.net
Whole thread Raw
In response to Re: [PERFORM] Backup taking long time !!!  (julyanto SUTANDANG <julyanto@equnix.co.id>)
List pgsql-performance
Greetings,

* julyanto SUTANDANG (julyanto@equnix.co.id) wrote:
> Please elaborate more of what you are saying. What i am saying is based on
> the Official Docs, Forum and our own test. This is what we had to do to
> save time, both backing up and  restoring.
>
> https://www.postgresql.org/docs/9.6/static/functions-admin.html
>
> When PostgreSQL in the mode of Start Backup, PostgreSQL only writes to the
> XLOG, then you can safely rsync / copy the base data (snapshot) then later
> you can have full copy of snapshot backup data.

You are confusing two things.

After calling pg_start_backup, you can safely copy the contents of the
data directory, that is correct.

However, PostgreSQL *will* continue to write to the data directory.
That, however, is ok, because those changes will *also* be written into
the WAL and, after calling pg_start_backup(), you collect all of the
WAL using archive_command or pg_receivexlog.  With all of the WAL
which was created during the backup, PG will be able to recover from the
changes made during the backup to the data directory, but you *must*
have all of that WAL, or the backup will be inconsistent because of
those changes that were made to the data directory after
pg_start_backup() was called.

In other words, if you aren't using pg_receivexlog or archive_command,
your backups are invalid.

> if you wanted to backup in later day, you can use rsync then it will copy
> faster because rsync only copy the difference, rather than copy all the
> data.

This is *also* incorrect.  rsync, by itself, is *not* safe to use for
doing that kind of incremental backup, unless you enable checksums.  The
reason for this is that rsync has only a 1-second level granularity and
it is possible (unlikely, though it has been demonstrated) to miss
changes made to a file within that 1-second window.

> my latter explanation is: use pg_basebackup, it will do it automatically
> for you.

Yes, if you are unsure about how to perform a safe backup properly,
using pg_basebackup or one of the existing backup tools is, by far, the
best approach.  Attempting to roll your own backup system based on rsync
is not something I am comfortable recommending any more because it is
*not* simple to do correctly.

Thanks!

Stephen

Attachment

pgsql-performance by date:

Previous
From: julyanto SUTANDANG
Date:
Subject: Re: [PERFORM] Backup taking long time !!!
Next
From: julyanto SUTANDANG
Date:
Subject: Re: [PERFORM] Backup taking long time !!!