Re: Updating large postgresql database with blobs - Mailing list pgsql-hackers

From: Andrew Dunstan
Subject: Re: Updating large postgresql database with blobs
Date:
Msg-id: 45F57030.1010704@dunslane.net
In response to: Updating large postgresql database with blobs ("CAJ CAJ" <pguser@gmail.com>)
Responses: Re: Updating large postgresql database with blobs
List: pgsql-hackers
CAJ CAJ wrote:
> Hello,
>
> I didn't get any response on the GENERAL list so I'm escalating this ....
>
> We have several independent database servers with ~50GB+ databases 
> running postgres 8.0.x. We are planning to upgrade these databases to 
> postgres 8.2.x over the weekend.
>
> We plan to use the following steps to upgrade each server:
>
> 1. Dump the 8.0.x database cluster using 8.2.x pg_dumpall
> % ./pg_dumpall > pgdumpall_backup.sql
>
> 2. Dump the 8.0.x database including large objects in compressed 
> custom format using 8.2.x pg_dump
> % ./pg_dump -Fc -b -Z9 dbname > pgdump_lobs_backup
>
>
> Restoring database
> 1. Initialize 8.2.x database
> % initdb -D /data/pgdata
>
> 2. Restore template1 database from cluster dump
> % ./psql -d template1 < pgdumpall_backup.sql
>
> 3. Drop database dbname, else restoring will give an error about the 
> existing dbname
> % dropdb dbname
>
> 4. Create fresh dbname
> % createdb -O dbowner dbname
>
> 5. Restore database with lobs
> % ./pg_restore -v -Fc -d dbname -e -U dbowner < pgdump_lobs_backup
>
> Some of the problems we have are:
> 1. We are not sure if all of the data will be available after the 
> dump/restore with the above process
> 2. The dump and restore process is far too slow to complete over 
> the weekend (it takes approx 1GB/hr to dump on a dual G5 PPC 2GHz with 
> 1GB RAM and RAID 1 disks)
>
> What is the fastest way to upgrade postgres for large databases that 
> have binary objects?
>
> Thanks for all your help. 


Your procedure dumps and restores the databases twice. This seems less 
than sound. My prediction is that you could get a 50% speed improvement 
by fixing that ...

The only thing you really need pg_dumpall for is the global objects 
(roles and tablespaces). I would just use pg_dumpall -g to get those, 
and then use pg_dump -Fc + pg_restore for each actual database.
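Something along these lines (just a sketch; globals.sql and dbname.dump 
are arbitrary file names I've made up, and dbname/dbowner are 
placeholders as in your steps) dumps and reloads each database exactly 
once, using the 8.2.x binaries as in your plan:

% pg_dumpall -g > globals.sql                     # roles and other globals only
% pg_dump -Fc -b -Z9 dbname > dbname.dump         # custom-format dump, blobs included

% initdb -D /data/pgdata
% psql -d template1 -f globals.sql                # restore the globals first
% createdb -O dbowner dbname
% pg_restore -v -d dbname -U dbowner dbname.dump  # single restore pass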

Another thing is to make sure that pg_dump/pg_restore are not competing 
with postgres for access to the same disk(s). One way to do that is to 
run them from a different machine (they don't have to be run on the 
server machine); of course the network can then become a bottleneck, so 
YMMV.
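For example (oldserver and newserver are placeholder host names, and 
this assumes both servers accept connections from the machine doing the 
work):

% pg_dump -h oldserver -Fc -b -Z9 dbname > dbname.dump
% pg_restore -h newserver -d dbname -U dbowner dbname.dump

That at least keeps the dump file I/O off the database server's disks; 
whether it is a net win depends on your network.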

cheers

andrew


