Thread: Backup options?

Backup options?

From
Magnus Persson
Date:
I have found Barman to be adequate for my requirements (though still looking for alternatives). It does what it says on the box and thats fine. At the moment I have ~25 clusters, each with ~25 databases containing 200 schemas (multitenancy thing). The clusters run off of a synchronously replicated SAN, but I dont think that will affect read ops. I do have the option to put the dumps on a non-replicated LUN.

What implications does a call to pg_dumpall have on my databases? Is there even a remote chance that a lock could be started by pg_dumpall?

If we consider that I for some reason or the other can't use pg_dumpall against the production clusters, what are my options? One idea is that of using asynchronous replication and pull the dumps off of them. Are there any pitfalls related to the replication? During normal operations, will postgres ensure that the state on the slaves always reflect the state of the masters? In effect it would work similar to if I did the dump on the production servers? I recall reading something about asynchronous replication, but I'm unsure of what it was exactly or if it affects backups.

Re: Backup options?

From
John DeSoi
Date:
On Sep 15, 2014, at 2:06 PM, Magnus Persson <magnus.e.persson@gmail.com> wrote:

> If we consider that I for some reason or the other can't use pg_dumpall against the production clusters, what are my
options?One idea is that of using asynchronous replication and pull the dumps off of them. Are there any pitfalls
relatedto the replication? During normal operations, will postgres ensure that the state on the slaves always reflect
thestate of the masters? In effect it would work similar to if I did the dump on the production servers? I recall
readingsomething about asynchronous replication, but I'm unsure of what it was exactly or if it affects backups. 

Yes, this approach works, but it is not without possible problems. You may have to repeat a failed backup because of
queryconflicts. See 

http://www.postgresql.org/docs/current/static/hot-standby.html#HOT-STANDBY-CONFLICT

John DeSoi, Ph.D.




Re: Backup options?

From
Albe Laurenz
Date:
Magnus Persson wrote:

> What implications does a call to pg_dumpall have on my databases?

It will cause load (mostly I/O).
It will start long-running transaction that can cause autovacuum to lag behind
on cleaning up busy tables, leading to database bloat.

> Is there even a remote chance that a lock could be started by pg_dumpall?

The only locks are on tables, and they only block ALTER/DROP TABLE.

> If we consider that I for some reason or the other can't use pg_dumpall against the production
> clusters, what are my options? One idea is that of using asynchronous replication and pull the dumps
> off of them. Are there any pitfalls related to the replication?

Yes; unless configured properly, the dump may fail because of conflicts.
You'd have to use hot_standby_feedback=on, and that again can cause bloat
on the master cluster as described above.

> During normal operations, will
> postgres ensure that the state on the slaves always reflect the state of the masters? In effect it
> would work similar to if I did the dump on the production servers? I recall reading something about
> asynchronous replication, but I'm unsure of what it was exactly or if it affects backups.

The standby database is identical to the master database, but it may be
a little behind.  That is no problem at all for a backup.
Don't use synchronous replication for this purpose.

Yours,
Laurenz Albe