Thread: Schemas, databases, and backups
According to my reading of the PostgreSQL documentation, the two basic backup tools are pg_dump and pg_dumpall: pg_dump dumps a single database to a file, while pg_dumpall dumps all of the databases to a single file.

Currently we use MSSQL's built-in backup facility. That allows us, with a single command, to dump every database to a separate file on a daily basis (and we keep 14 days online). That makes recovering from a glitch in one of the databases very easy, and it's rather simple to go back to a particular day.

Also, schemas are new to us, so I'm still thinking about how they will affect our processes and databases. (I'm betting that the ultimate answer is going to be to look for some 3rd-party tool on pgFoundry.)

So, now for the questions:

1) Is there a tool (or is this easily scripted in bash?) that would iterate through the databases in PostgreSQL and dump them to individual files? I'm guessing that we would query pg_database and dump the database names to a file (how?) and then parse that to feed to pg_dump (I can figure this bit out myself).

2) What if I wanted to dump individual schemas? Is this dangerous / not recommended? (Probably not... but can I have relationships between tables in different schemas?)
On 11/24/05, Thomas Harold <tgh@tgharold.com> wrote:
> Also, schemas are new to us, so I'm still thinking about how they will
> affect our processes and databases.

If you want to run queries that retrieve info from more than one "database" at the same time, think about using schemas.

> 1) Is there a tool (or is this easily scripted in bash?) that would
> iterate through the databases in pgsql and dump them to individual
> files?

psql -d template1 -U postgres -t -c "select datname from pg_database where datname not in ('template1', 'template0', 'postgres');" | while read D;

or something like that in a shell script, and then simply pg_dump $D...

> 2) What if I wanted to dump individual schemas? Is this dangerous / not
> recommended? (Probably not... if I can have relationships between
> tables in different schemas?)

dunno

--
Regards,
Jaime Casanova
(DBA: DataBase Aniquilator ;)
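Jaime's sketch can be completed into a working loop. In the version below the psql and pg_dump calls are replaced by stub shell functions (the real commands are shown in the comments) so the control flow can be run anywhere; the database names alpha and beta are made up for illustration:

```shell
#!/bin/bash
# Sketch of the per-database dump loop described above. The psql and
# pg_dump calls are stand-in functions so the loop runs without a live
# server; swap in the real commands from the comments in production.

list_databases() {
    # Real version:
    # psql -At -U postgres -d template1 \
    #     -c "select datname from pg_database where not datistemplate and datallowconn;"
    printf '%s\n' alpha beta
}

dump_database() {
    # Real version: pg_dump -U postgres "$1" > "/backup/$1.sql"
    echo "would dump: $1"
}

list_databases | while read -r D
do
    dump_database "$D"
done
```

On a live server, drop the stubs and call psql/pg_dump directly; psql's -A (unaligned) and -t (tuples only) flags give output suitable for feeding straight into a loop.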
Thomas Harold wrote:
> 1) Is there a tool (or is this easily scripted in bash?) that would
> iterate through the databases in pgsql and dump them to individual
> files?

It's pretty trivial to code in a bash script, and I've done so for a while now. Since these are primarily command-line tools, you can easily pipe the output to compress the files, sort them by date, and make your life a bit easier overall.

I've included a copy of mine, which was used to back up from another server. It also automatically stores the dumps in monthly folders, with every day inside, for ease of retrieval. It's commented, it makes use of variables to speed things along, and to my knowledge it is still set up to back up the databases listed.
Please note, however, that this script has not been used in a while; the servers it was intended to work on are no longer operational, so any passwords or addresses are of no consequence!

Martin Foster
Creator/Designer Ethereal Realms
martin@ethereal-realms.org

#!/bin/bash
#######################################################################
# PG_BACKUP.SHL
#
# Created  : Martin Foster
# Modified : 24-Nov-2004
#######################################################################
#
# The PostgreSQL backup tool does not accept passwords on the command
# line, which makes automating the process difficult without loosening
# security.  This script instead calls the backup procedure locally on
# the server and, through SSH/RSH, sends the results back over the
# network to the appropriate workstation.
#
# An advantage of using SSH is that encryption and certificate-based
# authentication provide security along with ease of use.  Running the
# backup on the database server also allows a local copy to be
# generated simultaneously if so desired.

#################
# Data Members #

# Site-specific configuration
BACKUP='/var/archives/postgres'   # Backup path
DB='ethereal'                     # Databases to backup
CREDENTIALS='martin@io'           # Username/host combination
#
# Date-based hierarchy
DAY=`date +%d`
MONTH=`date +%m`
YEAR=`date +%y`
#
# Location of utilities
ARCHIVER="/usr/bin/bzip2"                       # Archiver to compress stream
DUMP="/usr/local/bin/pg_dump -cDU postgres"     # Database dumping utility
FILTER="/usr/bin/grep -v 'INSERT INTO param'"   # Filter used with dump
REMOTE="/usr/bin/ssh"                           # Remote access capability
#
# File settings
PERM='0644'    # Permissions to take on
OWNER='root'   # Owner of file
GROUP='wheel'  # Owner group
EXT='.bz2'     # Extension to use
#
# Verbosity
VERBOSE='NO'   # Display information to screen

#################
# Program area #

# Explain what happens
if [ "$VERBOSE" = "YES" ]
then
   # Go change into the log dir
   if [ -d $BACKUP ]
   then
      # See if month dir exists in log dir - if not, create one
      if [ ! -d "$BACKUP/$YEAR$MONTH" ]
      then
         # Date dir does not exist in the log dir - create one
         /bin/mkdir "$BACKUP/$YEAR$MONTH" >/dev/null 2>&1
         if [ $? -ne 0 ]
         then
            # Could not create log date dir
            echo -e " Could not create: $BACKUP/$YEAR$MONTH - exiting!\n"
            exit 1
         else
            # Created dir
            echo -e " Created: $BACKUP/$YEAR$MONTH\n"
         fi
      fi

      # Change working directory
      cd "$BACKUP/$YEAR$MONTH"

      # Cycle through list
      for SINGLE in $DB
      do
         echo -e " Initiating backup of database: $SINGLE"
         $REMOTE $CREDENTIALS "$DUMP $SINGLE | $FILTER | $ARCHIVER" > $SINGLE-$MONTH$DAY$EXT

         echo -e " Setting permissions..."
         chmod $PERM $SINGLE-$MONTH$DAY$EXT
         echo -e " Setting owner..."
         chown $OWNER $SINGLE-$MONTH$DAY$EXT
         echo -e " Setting group..."
         chgrp $GROUP $SINGLE-$MONTH$DAY$EXT

         # Verify file exists
         if [ -f "$SINGLE-$MONTH$DAY$EXT" ]
         then
            echo -e " Backup successful!"
         else
            echo -e " Backup failed!"
         fi
      done
   else
      # Warning
      echo -e " Working directory not found: $BACKUP\n"
   fi

# Lack of detail
else
   # Go change into the log dir
   if [ -d $BACKUP ]
   then
      # See if month dir exists in log dir - if not, create one
      if [ ! -d "$BACKUP/$YEAR$MONTH" ]
      then
         # Date dir does not exist in the log dir - create one
         /bin/mkdir "$BACKUP/$YEAR$MONTH" >/dev/null 2>&1
         if [ $? -ne 0 ]
         then
            # Could not create log date dir
            echo -e "Could not create: $BACKUP/$YEAR$MONTH - exiting!\n"
            exit 1
         fi
      fi

      # Change working directory
      cd "$BACKUP/$YEAR$MONTH"

      # Cycle through list
      for SINGLE in $DB
      do
         $REMOTE $CREDENTIALS "$DUMP $SINGLE | $FILTER | $ARCHIVER" > $SINGLE-$MONTH$DAY$EXT
         chmod $PERM $SINGLE-$MONTH$DAY$EXT
         chown $OWNER $SINGLE-$MONTH$DAY$EXT
         chgrp $GROUP $SINGLE-$MONTH$DAY$EXT

         # Verify file exists
         if [ ! -f "$SINGLE-$MONTH$DAY$EXT" ]
         then
            echo -e " Backup of $SINGLE failed!"
         fi
      done
   else
      # Warning
      echo -e " Working directory not found: $BACKUP\n"
   fi
fi
Jaime Casanova wrote:
> psql -d template1 -U postgres -t -c "select datname from pg_database
> where datname not in ('template1', 'template0', 'postgres');" | while
> read D;
>
> or something like that in a shell script, and then simply pg_dump $D...

I found the following snippet of code, which roughly matches yours. It was over in the Red Hat mailing lists, where it was used to vacuum databases:

# su postgres -c 'psql -t -c "select datname from pg_database order by datname;" template1' | xargs -n 1 echo
template0
template1
test1
test2

After some mucking about, I came up with the following single-line shell command (suitable for adding to root's crontab):

# su postgres -c 'psql -t -c "select datname from pg_database where not datistemplate and datallowconn order by datname;" template1' | xargs -n 1 -i pg_dump -Ft -b -U postgres -f /backup/postgresql/pgsql.`date +%Y%m%d.%H%M`.{}.tar {}

I couldn't figure out how to add the "not in ('template0', 'template1', 'postgres')" to the single-line shell command. It seemed to confuse the shell.

Issues with the above command:

1) The date gets reevaluated for each new execution of pg_dump, which is not necessarily ideal if you want filenames that group easily. Converting to a shell script would allow finer control.

2) The output is not compressed. I guess I could switch to using "-Fc" in conjunction with "-Z 9".

Additional questions and notes:

A) pg_dump takes a "--schema=schema" argument, which looks like it allows me to dump just the contents of a particular schema within a database. So if I wanted, I could iterate through the list of schemas and go that route.

B) There's also a "--table=table" argument, which dumps a single table. The man page for pg_dump warns me that pg_dump will not output any objects that the table depends on, so it may not be possible to restore the dump. (The same warning applies to --schema=schema.)

C) I'm not sure whether I can get away with using "where not datistemplate and datallowconn".
For backing up user databases, does it matter? I can't figure out how to quote the commands properly to keep bash from getting confused. (Doubled-up quotes? Escaped quotes?)

D) After more mucking, I figured out how to set a static datestamp value for the entire command and compress the tar files using gzip. I'm not sure whether I should use "export" or "set" (both worked).

# export DTSTAMP=`date +%Y%m%d.%H%M` ; su postgres -c 'psql -t -c "select datname from pg_database where not datistemplate and datallowconn order by datname;" template1' | xargs -n 1 -i bash -c "pg_dump -Ft -b -U postgres {} | gzip -c > /backup/postgresql/pgsql.${DTSTAMP}.{}.tgz"

Links:
http://archives.postgresql.org/pgsql-general/2000-01/msg00593.php
http://postgis.refractions.net/pipermail/postgis-users/2005-November/009925.html
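On question (C)'s quoting trouble: one standard bash idiom for a literal single quote inside a single-quoted string is to close the string, emit an escaped quote, and reopen it ('\''). A small sketch that just echoes the command it builds, rather than handing it to su/psql:

```shell
#!/bin/bash
# A single quote cannot be escaped inside a single-quoted string, but
# the string can be closed, an escaped quote added, and a new string
# opened: the '\'' sequence. Echoed here instead of passed to su -c.
CMD='psql -t -c "select datname from pg_database where datname not in ('\''template0'\'', '\''template1'\'')" template1'
echo "$CMD"
# → psql -t -c "select datname from pg_database where datname not in ('template0', 'template1')" template1
```

The same `'\''` trick works inside the `su postgres -c '...'` wrapper, which is where the nesting confused the shell.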
Thomas Harold wrote:
> A) pg_dump takes a "--schema=schema" argument, which looks like it
> allows me to dump just the contents of a particular schema within a
> database. So if I wanted, I could iterate through the list of schemas
> and go that route.
>
> B) There's also a "--table=table" argument, which dumps a single table.

Sure you can; just as with pg_database, it's pg_namespace for schemas and pg_tables for tables...

> C) I'm not sure whether I can get away with using "where not
> datistemplate and datallowconn". For backing up user databases, does it
> matter? I can't figure out how to quote the commands properly to keep
> bash from getting confused. (Doubled-up quotes? Escaped quotes?)

"where not datistemplate and datallowconn" is better than "datname not in (values)".

> D) After more mucking, I figured out how to set a static datestamp value
> for the entire command and compress the tar files using gzip. I'm not
> sure whether I should use "export" or "set" (both worked).
> # export DTSTAMP=`date +%Y%m%d.%H%M` ; su postgres -c 'psql -t -c
> "select datname from pg_database where not datistemplate and
> datallowconn order by datname;" template1' | xargs -n 1 -i bash -c
> "pg_dump -Ft -b -U postgres {} | gzip -c >
> /backup/postgresql/pgsql.${DTSTAMP}.{}.tgz"

Maybe it's a good idea to put all that in a script... it's getting bigger and bigger (and uglier and uglier).

> Links:
> http://archives.postgresql.org/pgsql-general/2000-01/msg00593.php
> http://postgis.refractions.net/pipermail/postgis-users/2005-November/009925.html

--
Regards,
Jaime Casanova
(DBA: DataBase Aniquilator ;)
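Following Jaime's suggestion, the one-liner from (D) might be unrolled into a script along these lines. This is only a sketch: the backup directory and database names are placeholders, and the psql/pg_dump calls are stubbed with shell functions (real commands shown in the comments) so the flow can be run without a server:

```shell
#!/bin/bash
# Readable version of the single-line backup command above.
BACKUP=$(mktemp -d)              # real version: /backup/postgresql
DTSTAMP=$(date +%Y%m%d.%H%M)     # evaluated once, so every file shares it

list_databases() {
    # Real version:
    # su postgres -c 'psql -t -c "select datname from pg_database
    #   where not datistemplate and datallowconn order by datname;" template1'
    printf '%s\n' inventory sales
}

dump_database() {
    # Real version: pg_dump -Ft -b -U postgres "$1" | gzip -c
    echo "pretend dump of $1" | gzip -c
}

list_databases | while read -r DB
do
    dump_database "$DB" > "$BACKUP/pgsql.${DTSTAMP}.${DB}.tgz"
done

echo "created $(ls "$BACKUP" | wc -l | tr -d ' ') archives"
# → created 2 archives
```

Because DTSTAMP is captured once at the top, all of a run's files share the same datestamp and group together when sorted by name.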
On Thu, 24 Nov 2005 07:51:03 -0500, Thomas Harold wrote:
> 1) Is there a tool (or is this easily scripted in bash?) that would
> iterate through the databases in pgsql and dump them to individual
> files? I'm guessing that we would query pg_database and dump the
> database names to a file (how?) and then parse that to feed to pg_dump
> (I can figure this bit out myself).

The FreeBSD port installs a script called 502.pgsql that dumps each database to a separate file. It looks like it is close to what you are looking for.

http://www.freebsd.org/cgi/cvsweb.cgi/ports/databases/postgresql81-server/files/502.pgsql?rev=1.7&content-type=text/x-cvsweb-markup