Thread: backing and restoring whole database server - how does this script look

backing and restoring whole database server - how does this script look

From
Kevin Bailey
Date:
We're trying to get a script to backup and restore a whole database server.

So far we're looking at something like

for db in `psql -U postgres -At -c 'select datname from pg_database where not datistemplate' -d template1`; do
    pg_dump -U ${PGSQL_USER} -h localhost --blobs --format=c --file="/var/backups/pg_dump_${db}.pgd" -d template1;

    psql -U statraining -h localhost --command "DROP DATABASE ${db};" template1;

    psql -U aplus2admin -h localhost --command "CREATE DATABASE ${db} WITH OWNER ${db};" template1;

    psql -U ${db} -h localhost -d ${db} -f /home/statraining/${db}_pg_backup.pgd;

done


which is obviously a rough-cut - and the main problem we have is that a client has set up a DB where the owner names do not match up with the DB names.
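A hedged sketch of how that rough-cut might look with the owner problem handled: instead of assuming the owner name matches the database name, each database's real owner is read out of pg_database with pg_get_userbyid(). The postgres user, localhost target and /var/backups path are placeholders, not tested against a live server.

```shell
#!/bin/bash
# Sketch only: drop/recreate each non-template database with its real owner.
# The admin user, localhost target and /var/backups path are assumptions.

sync_one_db() {
    db="$1"; owner="$2"
    dump="/var/backups/pg_dump_${db}.pgd"
    pg_dump -U postgres -h localhost --blobs --format=c --file="$dump" "$db"
    psql -U postgres -h localhost --command "DROP DATABASE \"${db}\";" template1
    psql -U postgres -h localhost --command "CREATE DATABASE \"${db}\" WITH OWNER \"${owner}\";" template1
    pg_restore -U postgres -h localhost --dbname="$db" "$dump"
}

sync_all() {
    # datname plus its real owner; pg_get_userbyid() is available back to 7.4
    psql -U postgres -At -c \
        "SELECT datname || '|' || pg_get_userbyid(datdba)
           FROM pg_database WHERE NOT datistemplate" template1 |
    while IFS='|' read -r db owner; do
        sync_one_db "$db" "$owner"
    done
}
```

sync_all would then run from cron; double-quoting the identifiers also copes with names like bean-demo that need quoting in SQL.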

Or - isn't there something along the lines of:

$ pg_dumpall > all.dbs.out

(transfer file and then on other server...)

$ psql -f all.dbs.out postgres


Any pointers gratefully received.

Just to make things more interesting, we have to do this for a version 8.1 server and an older 7.4 server which has blobs in it.

bailey86
i had a similar (though not as complicated) question:

On Fri, Apr 10, 2009 at 5:34 AM, Kevin Bailey
<kbailey@freewayprojects.com> wrote:
> We're trying to get a script to backup and restore a whole database server.
>
> So far we're looking at something like
>
> for db in `psql -U postgres -At -c 'select datname from pg_database where
> not datistemplate' -d template1`; do
>     pg_dump -U ${PGSQL_USER} -h localhost --blobs --format=c
> --file="/var/backups/pg_dump_${db}.pgd" -d template1;
>
>     psql -U statraining -h localhost --command "DROP DATABASE ${db};"
> template1;
>
>     psql -U aplus2admin -h localhost --command "CREATE DATABASE ${db} WITH
> OWNER ${db};" template1;
>
>     psql -U ${db} -h localhost -d ${db} -f
> /home/statraining/${db}_pg_backup.pgd;
>
> done
>
>
> which is obviously a rough-cut - and the main problem we have is that a
> client has set up a DB where the owner names do not match up with the DB
> names.
>
> Or - isn't there something along the lines of:
>
> $ pg_dumpall > all.dbs.out
>
> (transfer file and then on other server...)
>
> $ psql -f all.dbs.out postgres

the sysad on my end, does something like this (backs up the entire
directory) as a weekly cronjob:

/etc/init.d/postgresql stop > /dev/null
rsync -a -H --delete /var/lib/pgsql/ /scratch/pgsql
/etc/init.d/postgresql start > /dev/null

then /scratch/pgsql is backed up over to two different machines. the
DB that i run does not need to be live 24/7, so this stopping and
starting works for us. how safe is this method? in other words, when
things get screwed, will an rsync from /scratch/pgsql to /var/lib/pgsql
save a life? i know it DID NOT when the server was upgraded from 32-bit
to 64-bit and i was forced to recreate the whole DB!!


--
Regards
PK
--------------------------------------
http://counter.li.org  #402424

Re: backing and restoring whole database server - how does this script look

From
Peter Koczan
Date:
On Fri, Apr 10, 2009 at 4:34 AM, Kevin Bailey
<kbailey@freewayprojects.com> wrote:
> We're trying to get a script to backup and restore a whole database server.
>
> So far we're looking at something like
> ...
> which is obviously a rough-cut - and the main problem we have is that a
> client has set up a DB where the owner names do not match up with the DB
> names.

You should consider adding a few different flags. Here's what I do to
dump/restore a specific database.

pg_dump -h [oldhost] -p 5432 -C -Fc [database] | pg_restore -h [newhost] -p 5432 -C -d template1

The -C flag in pg_dump adds a command to create the database in the
dumpfile (and connect to it immediately afterward). -Fc dumps the
output in a custom, compressed format, but that output isn't readable
by psql.

Then, the -C flag in pg_restore says to read and execute that create
command we dumped, but we have to give it a different database to
initially connect to.

Of course, depending on how you have your network and pg_hba.conf set
up, you might not be able to use a direct pipe. If you can't, then
you'll have to add the steps of dumping the output to a file and
transferring said file before restoring. No big deal, I'm sure you can
interpolate for what you need.
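For the file-based route, a minimal sketch of those extra steps (the host names, the /tmp staging path and the use of scp/ssh are assumptions for illustration):

```shell
# Sketch only: same -C/-Fc dump as above, but staged through a file when a
# direct pipe between the hosts isn't possible.
dump_transfer_restore() {
    db="$1"; oldhost="$2"; newhost="$3"
    dumpfile="/tmp/${db}.dump"
    pg_dump -h "$oldhost" -p 5432 -C -Fc "$db" > "$dumpfile"
    scp "$dumpfile" "${newhost}:${dumpfile}"
    # pg_restore -C replays the CREATE DATABASE stored in the dump
    ssh "$newhost" pg_restore -p 5432 -C -d template1 "$dumpfile"
}
```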

> Just to make things more interesting, we have to do this for a version 8.1
> server and an older 7.4 server which has blobs in it.

If I remember correctly, pg_dumpall only allows text format dumps, and
blobs need to be dumped in non-text formats. This might have changed
since 7.4, but I couldn't find any source either way.

Hope this helps.

Peter

Peter Koczan <pjkoczan@gmail.com> writes:
> On Fri, Apr 10, 2009 at 4:34 AM, Kevin Bailey
> <kbailey@freewayprojects.com> wrote:
>> Just to make things more interesting, we have to do this for a version 8.1
>> server and an older 7.4 server which has blobs in it.

> If I remember correctly, pg_dumpall only allows text format dumps, and
> blobs need to be dumped in non-text formats. This might have changed
> since 7.4, but I couldn't find any source either way.

The 8.1 release notes quoth:

    Allow pg_dump to dump large objects even in text mode (Tom)

    With this change, large objects are now always dumped; the former -b switch is a no-op.

so in that version and up you can use pg_dumpall without worrying about
blobs.  The upthread proposal of separately dumping each database in
-Fc mode is probably the only solution in 7.4, but you ought to be
thinking about how to get off 7.4 not how to perpetuate it.  We'll be
dropping support for 7.4 RSN.

            regards, tom lane

Re: backing and restoring whole database server - how does this script look

From
Kevin Bailey
Date:


Tom Lane wrote:
> Peter Koczan <pjkoczan@gmail.com> writes:
>> On Fri, Apr 10, 2009 at 4:34 AM, Kevin Bailey
>> <kbailey@freewayprojects.com> wrote:
>>> Just to make things more interesting, we have to do this for a version 8.1
>>> server and an older 7.4 server which has blobs in it.
>>
>> If I remember correctly, pg_dumpall only allows text format dumps, and
>> blobs need to be dumped in non-text formats. This might have changed
>> since 7.4, but I couldn't find any source either way.
>
> The 8.1 release notes quoth:
>
>     Allow pg_dump to dump large objects even in text mode (Tom)
>
>     With this change, large objects are now always dumped; the former -b switch is a no-op.
>
> so in that version and up you can use pg_dumpall without worrying about
> blobs.  The upthread proposal of separately dumping each database in
> -Fc mode is probably the only solution in 7.4, but you ought to be
> thinking about how to get off 7.4 not how to perpetuate it.  We'll be
> dropping support for 7.4 RSN.
>
>             regards, tom lane
This is still giving out plenty of error messages - BTW - both servers are on 8.1

time pg_dumpall --inserts --clean -h $REMOTE -U admin | psql -U postgres template1


Will reply with the error messages after the cron job has run in an hour or so.

Regards,

bailey86

Can a whole server be restored over to another server in a single command?

From
Kevin Bailey
Date:
Can a whole server be restored over to another server in a single command?

Or would it always need some sort of loop.

I've tried

pg_dumpall --inserts --clean -h $REMOTE -U admin | psql -U postgres template1

but this produces errors such as

ERROR:  role "admin" cannot be dropped because some objects depend on it
DETAIL:  owner of database test1
owner of database test
7 objects in database test1
201 objects in database test
ERROR:  role "admin" already exists
ERROR:  current user cannot be dropped
ERROR:  role "postgres" already exists
ERROR:  role "sql-ledger" cannot be dropped because some objects depend on it
DETAIL:  owner of database bean-demo
owner of database bean-cash
owner of database bean-41
owner of database bean-40
owner of database bean-39
owner of database bean-37
owner of database bean-36
owner of database bean-35


We're also looking at:

PG_DBs=(`for pg_db in /var/backups/pg_dump_*.pgd ; do basename ${pg_db} .pgd | cut -f 3 -d\_ ; done`)

for index in $(seq 0 $((${#PG_DBs[@]} - 1)))
 do
  PG_SRC=${PG_DBs[$index]};
  echo;
  echo restoring PgSQL database ${PG_SRC} to ${REMOTE_SRV};
  psql -U admin -h ${REMOTE_SRV} --command "DROP DATABASE ${PG_SRC}" template1;
  psql -U admin -h ${REMOTE_SRV} --command "CREATE DATABASE ${PG_SRC}" template1;
  pg_restore -U admin -h ${REMOTE_SRV} --dbname=${PG_SRC} /var/backups/pg_dump_${PG_SRC}.pgd;
done

Although I feel this can be simplified by using

psql -l -A -t




So - single command or loop?

And if we're using the loop how do we make sure that the ownership of
everything is OK?

Thanks,

Kevin Bailey

The problem is that we'd like to repeatedly carry out this action - so
existing objects are causing errors.

Maybe there can be a combination?   Say, completely dropping all
databases on the receiving server before reloading from the backup file?

Kev

How's this for a quick and dirty replication plan?

We need to replicate a PG on one server to another server repeatedly.
The issues coming up relate to:

 * Using 'clean' to be able to clean out the data has a problem because
roles cannot be dropped due to dependencies.
 * We'd like to use pg_dumpall because it recreates DB ownership and all
functions etc.

We could loop through the database names and drop/recreate and then
reload the DB using pg_dump - but this won't create any new users which
may have been created on the DB.  Also, we have to be careful not to
recreate the postgres, template0 and template1 databases.



So - is there any reason we can't?

Reinitialise the Secondary DB as if it is a new DB.  By this I mean drop
all data and all databases.
Reload the entire DB from the output of pg_dumpall.

Comments gratefully received.

Kev

Kevin Bailey <kbailey@freewayprojects.com> writes:
> Can a whole server be restored over to another server in a single command?

The output of pg_dumpall is really meant to be loaded into an empty
server.  Although it has a "--clean" switch, that's got serious
limitations and I wouldn't recommend bothering with it.

What exactly is the scenario you are trying to have here?  Is this
a poor man's substitute for replication?  Do you want the destination
server to include other stuff besides what is coming from the source?

            regards, tom lane

Tom Lane wrote:
> Kevin Bailey <kbailey@freewayprojects.com> writes:
>> Can a whole server be restored over to another server in a single command?
>
> The output of pg_dumpall is really meant to be loaded into an empty
> server.  Although it has a "--clean" switch, that's got serious
> limitations and I wouldn't recommend bothering with it.
>
> What exactly is the scenario you are trying to have here?  Is this
> a poor man's substitute for replication?

Yes.  Poor man's replication is what we're after!    :o)

We have a server which is used by a client who provides web based financial accounts applications.  The application is always being developed/extended and new databases are being added all the time.

What we're trying to provide is a secondary server which is synchronised every night from the primary server.  This would be used if there was a major problem with the primary server.  Effectively, this is a step up from rebuilding a whole application server from backup data.  I.e. the Secondary(replacement) server is already built, online, application/libraries installed and the data is up-to-date as per the previous night.

(Possibly, if the budget is available in the future we may recommend a clustering type solution).

> Do you want the destination
> server to include other stuff besides what is coming from the source?
The destination server (we call the Secondary server) only needs to be a copy of the Primary (live) server.


The current plan is:

On a Postgresql database server the two lowest level objects seem to be users(/roles) and databases.  Now, the Secondary server already has the DB user accounts created.

So what we're going to do is to concentrate on dropping then recreating the databases on the Secondary server and then filling them with data from the Primary databases.

We have a bash loop which can list the databases, owners and collation on the Primary server.

for db_info in $(psql -U postgres -h ${remote} -l -A -t)
  do

  echo ${db_info}
done

Which gives us output something like

bean-21|sql-ledger|UTF8
bean-22|sql-ledger|UTF8
bean-23|sql-ledger|UTF8
bean-24|sql-ledger|UTF8

(We will put in code so the databases postgres, template0 and template1 will be ignored).

Now, within that loop we can get the three values for database, owner and collation using IFS type stuff.
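That "IFS type stuff" might look like the following sketch, which also skips the postgres, template0 and template1 entries mentioned above (field names and output format are illustrative only):

```shell
# Sketch only: split each `psql -l -A -t` line on '|' and drop the system DBs.
# Any extra trailing columns (later psql versions print more) fall into `_`.
list_user_dbs() {
    while IFS='|' read -r db owner encoding _; do
        case "$db" in postgres|template0|template1|'') continue ;; esac
        echo "db=$db owner=$owner encoding=$encoding"
    done
}
```

It would be fed from the same loop source, e.g. `psql -U postgres -h ${remote} -l -A -t | list_user_dbs`.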

For each database in the list we will:

1. DROP the database on the Secondary server which was created in the previous run - if it exists.
2. CREATE a new (empty) database with the relevant owner and collation.
3. Use pg_dump to extract data from the Primary DB and then feed this into the newly created Secondary database.

(We will possibly be a bit cleverer and dump the databases to files first and transfer them to try to make the process a bit more atomic).

Now, if a needed user account is missing from the Secondary we should get an error message returned from the cron job and we'll manually create the user.

We could be even cleverer and use psql to get a list of users from the primary and make sure they are present on the secondary but in our case there are only four users and they hardly ever change.
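Being "even cleverer" could be as small as diffing the two role lists; a sketch, assuming the pg_user view and passwordless access as the postgres user (both assumptions):

```shell
# Sketch only: report roles present on the primary but missing on the secondary.
list_roles() {
    psql -U postgres -h "$1" -At \
        -c "SELECT usename FROM pg_user ORDER BY usename" template1
}

missing_roles() {
    # comm -23 keeps lines unique to the first (primary) list;
    # both lists arrive sorted thanks to ORDER BY
    comm -23 <(list_roles "$1") <(list_roles "$2")
}
```

Any name it prints would then be created by hand (or scripted with CREATE ROLE) on the secondary.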

When I've finished the code I'll post it to the list.  The other part I'd like to figure out is how to get it to be as atomic as possible.

Cheers,

Kevin


Kevin Bailey <kbailey@freewayprojects.com> writes:
> Tom Lane wrote:
>> What exactly is the scenario you are trying to have here?  Is this
>> a poor man's substitute for replication?

> Yes.  Poor man's replication is what we're after!    :o)

>> Do you want the destination
>> server to include other stuff besides what is coming from the source?

> The destination server (we call the Secondary server) only needs to be a
> copy of the Primary (live) server.

Well, the easiest solution is just to re-initdb the secondary and load
pg_dumpall output into it.  Is there anything you expect to gain by
re-inventing portions of pg_dumpall?

            regards, tom lane


>> The destination server (we call the Secondary server) only needs to be a
>> copy of the Primary (live) server.
>
> Well, the easiest solution is just to re-initdb the secondary and load
> pg_dumpall output into it.

We're using Debian and have servers on Etch and Lenny.

I've never carried out initialisation before and wasn't sure if I was getting into something I don't know enough about.  AFAIK it should be something like:

Stop DB
Delete the contents of the 'datadir'
Run initdb something, something
Start DB

I wasn't sure if the Debian installer hadn't set things up in some special way.
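On Debian the installer does wrap things in the postgresql-common layer, so rather than hand-deleting the datadir, the re-init could go through the cluster tools; a sketch only, with the version number and cluster name as assumptions:

```shell
# Sketch only: re-initialise a Debian cluster via postgresql-common instead of
# deleting the datadir by hand.  This destroys all data in the cluster.
reinit_cluster() {
    ver="$1"; cluster="$2"
    pg_dropcluster --stop "$ver" "$cluster"     # stop and remove the old cluster
    pg_createcluster --start "$ver" "$cluster"  # initdb a fresh one and start it
}
# e.g. reinit_cluster 8.1 main, then: psql -U postgres -f all.dbs.out postgres
```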

> Is there anything you expect to gain by
> re-inventing portions of pg_dumpall?

The primary reason is that if you run the output from pg_dumpall --clean into a DB which already has previous copies of the databases then there are errors.  Primarily, it can't drop a ROLE because existing objects depend on it and errors are raised.

We want our systems guy to monitor the output from the synchronisation script and they are not DB developers.  Therefore, we need output which doesn't show ERRORs.  Normally, when they see ERRORs they would raise it with us DB guys.

Currently I feel that the sync script I want is within my capabilities - the re-init stuff would need investigation and testing.  I'll get on with the script and post up the result.  There are parts which others may find useful.

BTW - I've been using PG for years and really appreciate the hard work put in by you guys.  We've used it in all kinds of scenarios.  One of our clients was particularly happy about a Debian/PG box which replaced their flakey Access DB share and ran perfectly for years until they asked us to replace the whole application with a web app.

Cheers,

Kevin

>
> Currently I feel that the sync script I want is within my capabilities
> - the re-init stuff would need investigation and testing.  I'll get on
> with the script and post up the result.  There are parts which others
> may find useful.
>

Here's the script in its current form.  After all the work on it, it
has become fairly concise.  A couple of points first.

1. I suppose that to be a proper sync it should delete all user
databases on the *Secondary* server.

2. It wouldn't be too difficult to have a section first which would make
sure the Secondary server has matching user accounts.

# Here we loop through a list of databases - the flags cut out most
# formatting from the database list produced.
for db_name in $(psql -U admin -h ${REMOTE} -A -t --command "SELECT datname FROM pg_catalog.pg_database ORDER BY datname;" postgres)
  do

  if [ $db_name != 'postgres' -a $db_name != 'template0' -a $db_name != 'template1' ]; then

       echo
       echo `date`
       echo "Dumping out ${db_name}..."
       pg_dump -U admin -h ${REMOTE} --create ${db_name} > /var/backups/${db_name}.pgd

       echo
       echo `date`
       echo "Dropping database ${db_name} and reloading it..."
       psql -U admin -h localhost --command "DROP DATABASE \"${db_name}\";" postgres
       psql -U admin -h localhost -f /var/backups/${db_name}.pgd postgres > /dev/null
  fi

done


Hope it's of use to others.

BTW - This is what I've used when the database needs to be copied to a
different DB on the same server with a different owner.


echo
echo `date`
echo "Dumping database stalive..."
pg_dump -U stalive -h localhost --no-owner stalive > /home/statraining/stalive_pg_backup.pgd

echo
echo `date`
echo "Dropping database statraining..."
psql -U statraining -h localhost --command "DROP DATABASE statraining;" template1

echo
echo `date`
echo "Recreating database statraining..."
psql -U aplus2admin -h localhost --command "CREATE DATABASE statraining WITH OWNER statraining;" template1

echo
echo `date`
echo "Restoring database statraining from stalive backup..."
psql -U statraining -h localhost -d statraining -f /home/statraining/stalive_pg_backup.pgd


Kev

Lost my postgres database?

From
Kevin Bailey
Date:
Hi,

Just looking into an old 7.4 based server and the '\l' command isn't
showing a database called 'postgres' - template0 and template1 are there.

Is this expected on 7.4?  I've read the docs and can't see any mention
of this DB.  Should this DB exist?

Cheers,

Kevin

Re: Lost my postgres database?

From
Tom Lane
Date:
Kevin Bailey <kbailey@freewayprojects.com> writes:
> Just looking into an old 7.4 based server and the '\l' command isn't
> showing a database called 'postgres' - template0 and template1 are there.

> Is this expected on 7.4?

Yes.  We didn't start creating a 'postgres' database by default until
8.1 or thereabouts.

            regards, tom lane

Re: Lost my postgres database?

From
Scott Marlowe
Date:
On Sun, Apr 12, 2009 at 6:08 PM, Kevin Bailey
<kbailey@freewayprojects.com> wrote:
> Hi,
>
> Just looking into an old 7.4 based server and the '\l' command isn't showing
> a database called 'postgres' - template0 and template1 are there.
>
> Is this expected on 7.4?  I've read the docs and can't see any mention of
> this DB.  Should this DB exist?

Yep.  template1 used to serve the purpose that the postgres database
now serves.  You only get the template0/1 databases in 7.4.

Re: Lost my postgres database?

From
Kevin Bailey
Date:


Tom Lane wrote:
> Kevin Bailey <kbailey@freewayprojects.com> writes:
>> Just looking into an old 7.4 based server and the '\l' command isn't
>> showing a database called 'postgres' - template0 and template1 are there.
>>
>> Is this expected on 7.4?
>
> Yes.  We didn't start creating a 'postgres' database by default until
> 8.1 or thereabouts.
>
>             regards, tom lane
Thanks for confirming that - your earlier point RE 7.4 being dropped soon was noted and we'll put plans into place to migrate the server.

Cheers,

Kevin