Thread: Consistent state for pg_dump and pg_dumpall

Consistent state for pg_dump and pg_dumpall

From
Michael Nolan
Date:
The documentation for pg_dump says that dump files are created in a consistent state.

Is that true across multiple tables in the same pg_dump command?  (Obviously it would not be true if I dumped tables using separate pg_dump commands.  But if I put the database into a backup state using 'pg_start_backup', would separately executed pg_dump commands be in a consistent state across the set of dump files?)

The documentation for pg_dumpall does not say that its dump file is in a consistent state (eg, across all tables), but it does say that it uses pg_dump to dump clusters.  So, how consistent are the tables in pg_dumpall files? 
--
Mike Nolan



Re: Consistent state for pg_dump and pg_dumpall

From
"David G. Johnston"
Date:
Yes.  The entire dump is performed within a single transaction.

On Wed, May 20, 2015 at 9:24 AM, Michael Nolan <htfoot@gmail.com> wrote:
The documentation for pg_dump says that dump files are created in a consistent state.

Is that true across multiple tables in the same pg_dump command?  (Obviously it would not be true if I dumped tables using separate pg_dump commands. 

Yes.  The entire dump is performed within a single transaction.​

But if I put the database into a backup state using 'pg_start_backup', would separately executed pg_dump commands be in a consistent state across the set of dump files?)


​pg_start_backup and pg_dump are not designed to work together.​  Namely, pg_start_backup is mostly concerned with making sure future writes are accounted for in the final backup while pg_dump says to ignore everything that happens after the command begins.

The documentation for pg_dumpall does not say that its dump file is in a consistent state (eg, across all tables), but it does say that it uses pg_dump to dump clusters.  So, how consistent are the tables in pg_dumpall files?  

Each database is internally consistent.  There is no guarantee that databases and globals are consistent with each other (though those are typically seldom changed) but different databases will to represent the same point in time vis-a-vis each other.



You might want to describe what you are trying to do here.

David J.

Re: Consistent state for pg_dump and pg_dumpall

From
Michael Nolan
Date:


On Wed, May 20, 2015 at 12:40 PM, David G. Johnston <david.g.johnston@gmail.com> wrote:
Yes.  The entire dump is performed within a single transaction.

On Wed, May 20, 2015 at 9:24 AM, Michael Nolan <htfoot@gmail.com> wrote:
The documentation for pg_dump says that dump files are created in a consistent state.

Is that true across multiple tables in the same pg_dump command?  (Obviously it would not be true if I dumped tables using separate pg_dump commands. 

Yes.  The entire dump is performed within a single transaction.​

But if I put the database into a backup state using 'pg_start_backup', would separately executed pg_dump commands be in a consistent state across the set of dump files?)


​pg_start_backup and pg_dump are not designed to work together.​  Namely, pg_start_backup is mostly concerned with making sure future writes are accounted for in the final backup while pg_dump says to ignore everything that happens after the command begins.

The documentation for pg_dumpall does not say that its dump file is in a consistent state (eg, across all tables), but it does say that it uses pg_dump to dump clusters.  So, how consistent are the tables in pg_dumpall files?  

Each database is internally consistent.  There is no guarantee that databases and globals are consistent with each other (though those are typically seldom changed) but different databases will to represent the same point in time vis-a-vis each other.



You might want to describe what you are trying to do here.

David J.

I'm getting ready for a security audit and I want to make sure I have the database backup procedures properly documented, including what the limitations are on each type of backup .  We us a combination of low level backups with log shipping, dumping of key individual tables, dumping of entire databases and dumping the entire system (pg_dumpall.)  Hardware for setting up a slave server may be in a future budget, though I hope to be able to test having a slave server in the cloud later this year.  (I'm not sure we have enough network bandwidth for that, hence the test.)

When I moved to a new release of pg (9.3) last December, I stopped all transaction processing first so that pg_dumpall had no consistency issues. 
--
Mike Nolan

Re: Consistent state for pg_dump and pg_dumpall

From
John R Pierce
Date:
On 5/20/2015 10:44 AM, Michael Nolan wrote:

When I moved to a new release of pg (9.3) last December, I stopped all transaction processing first so that pg_dumpall had no consistency issues. 

the only possible consistency issue would be if you have applications doing 2-phase commits to two different databases on the same server, otherwise each database is dumped as a single transaction and all data elements within that database are point-in-time consistent. 

my preferred backup procedure for a whole server dump is to

    A) pg_dumpall --globals-only | gzip > ...
    B) for each database, do pg_dump -Fc -f $database.Fc.pgdump $database

I do this via the following crontab entry for hte postgres user...

$ crontab -l
30 1 * * * /var/lib/pgsql/cronbackup.sh
and this script...

#!/bin/bash
#
/usr/pgsql-9.3/bin/pg_dumpall --globals-only | gzip > /home2/backups/pgsql/pgdumpall.globals.`date +\%a`.sql.gz
for i in $(psql -tc "select datname from pg_database where not datistemplate"); do \
    pg_dump -Fc -f /home2/backups/pgsql/pgdump.$i.$(date +\%a).dump $i
        done


-- 
john r pierce, recycling bits in santa cruz